[postgresql] CSV 파일에서 헤더가있는 PostgreSQL 테이블로 CSV 파일을 복사하는 방법은 무엇입니까?

Question 1

CSV 파일을 Postgres 테이블에 복사하고 싶습니다. 이 테이블에는 약 100 개의 열이 있으므로 필요하지 않은 경우 다시 작성하고 싶지 않습니다.

\copy table from 'table.csv' delimiter ',' csv;명령을 사용하고 있지만 테이블이 생성되지 않으면 ERROR: relation "table" does not exist. 빈 테이블을 추가해도 오류가 발생하지 않지만 아무 일도 일어나지 않습니다. 이 명령을 두세 번 시도했는데 출력이나 메시지가 없었는데 PGAdmin을 통해 확인했을 때 테이블이 업데이트되지 않았습니다.

내가하려는 것처럼 헤더가 포함 된 테이블을 가져 오는 방법이 있습니까?

Question 2

이것은 효과가 있었다. 첫 번째 행에는 열 이름이 있습니다.

COPY wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER

Question 3

Python 라이브러리 pandas를 사용하면 열 이름을 쉽게 만들고 csv 파일에서 데이터 유형을 추론 할 수 있습니다.

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('postgresql://user:pass@localhost/db_name')
df = pd.read_csv('/path/to/csv_file')
df.to_sql('pandas_db', engine)

if_exists매개 변수, 예를 바꾸거나 기존 테이블에 추가하도록 설정할 수 있습니다 df.to_sql('pandas_db', engine, if_exists='replace'). 이것은 추가 입력 파일 유형, 여기 및 여기 문서에서도 작동합니다 .

Question 4

허가없이 터미널에 의한 대안

NOTES의에서 페이지 문서
말

경로는 클라이언트의 작업 디렉토리가 아니라 서버 프로세스의 작업 디렉토리 (일반적으로 클러스터의 데이터 디렉토리)에 상대적으로 해석됩니다.

따라서 전체적으로 psql또는 다른 클라이언트를 사용 하거나 로컬 서버에서도 문제가 있습니다 … 그리고 다른 사용자를 위해 COPY 명령을 표현하는 경우 예를 들어. Github README에서 독자는 문제가 있습니다 …

클라이언트 권한으로 상대 경로 를 표현하는 유일한 방법 은 STDIN을 사용하는 것 입니다 .

STDIN 또는 STDOUT을 지정하면 클라이언트와 서버 간의 연결을 통해 데이터가 전송됩니다.

여기에 기억 된 대로 :

psql -h remotehost -d remote_mydb -U myuser -c \
   "copy mytable (column1, column2) from STDIN with delimiter as ','" \
   < ./relative_path/file.csv

Question 5

나는이 기능을 문제없이 한동안 사용하고있다. csv 파일에있는 숫자 열만 제공하면 첫 번째 행에서 헤더 이름을 가져 와서 테이블을 만듭니다.

create or replace function data.load_csv_file
    (
        target_table  text, -- name of the table that will be created
        csv_file_path text,
        col_count     integer
    )

    returns void

as $$

declare
    iter      integer; -- dummy integer to iterate columns with
    col       text; -- to keep column names in each iteration
    col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    set schema 'data';

    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format ('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format ('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_file_path);

    iter := 1;
    col_first := (select col_1
                  from temp_table
                  limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format ('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format ('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row // using quote_ident or %I does not work here!?
    execute format ('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length (target_table) > 0 then
        execute format ('alter table temp_table rename to %I', target_table);
    end if;
end;

$$ language plpgsql;

Question 6

기본 DB 가져 오기 명령을 사용하기 때문에 테이블을 생성하고 pd.to_sql ()보다 빠른 d6tstack 을 사용할 수 있습니다 . Postgres와 MYSQL 및 MS SQL을 지원합니다.

import pandas as pd
df = pd.read_csv('table.csv')
uri_psql = 'postgresql+psycopg2://usr:pwd@localhost/db'
d6tstack.utils.pd_to_psql(df, uri_psql, 'table')

또한 여러 CSV 가져 오기, 데이터 스키마 변경 해결 및 / 또는 db에 쓰기 전에 팬더 (예 : 날짜)로 전처리하는 데 유용합니다. 자세한 내용은 예제 노트북을 참조하십시오.

d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
    apply_after_read=apply_fun).to_psql_combine(uri_psql, 'table')

[postgresql] CSV 파일에서 헤더가있는 PostgreSQL 테이블로 CSV 파일을 복사하는 방법은 무엇입니까?

답변

답변

답변

허가없이 터미널에 의한 대안

답변

답변

답변