Import CSV data into your MongoDB Database

  • 1
    Launch the ETL Designer

    Log in to NodeChef. From the NodeChef Task Manager, click DB actions > CSV / JSON import to launch the designer.

  • 2
    Choose the file to import
    • If you are uploading the data, click the Local file system option and then choose the file from your file system. The file cannot exceed 192 megabytes in size; however, you can compress the file using gzip or bzip before uploading to bypass this limitation.
    • You can also specify an HTTP(S) or FTP(S) URI to the file. The ETL engine will automatically download the file and import the data. The Content-Length of the response from the server hosting the file cannot exceed 192 megabytes. You can compress the file on the server using gzip or bzip.
      If the file is hosted on S3 or GCS, you must enable direct access for the file or allow anonymous users to access it before providing the link.
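If your file is close to the 192 MB limit, compressing it before upload is usually enough, since CSV data compresses well. A minimal sketch of that preparation step (the function name and limit check are our own, not part of NodeChef):

```python
import gzip
import os
import shutil

def gzip_for_upload(src_path: str, limit_bytes: int = 192 * 1024 * 1024) -> str:
    """Compress src_path with gzip so the upload stays under the size limit.

    Hypothetical helper: NodeChef only requires that the uploaded file
    (compressed or not) be at most 192 MB.
    """
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream, so large files fit in memory
    if os.path.getsize(dst_path) > limit_bytes:
        raise ValueError("Compressed file still exceeds the upload limit")
    return dst_path
```

The same idea applies to files served over HTTP(S)/FTP(S): compress them on the server so the response body stays under the limit.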
  • 3
    Select input file compression method

    If you compressed the file you are importing, select the compression method used. We currently support gzip and bzip.
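If you are unsure which method a file was compressed with, the first few bytes tell you: gzip files start with the bytes `1f 8b`, and bzip files start with `BZh`. A small sketch (the function is illustrative, not a NodeChef API):

```python
def detect_compression(path: str) -> str:
    """Guess the compression method from the file's magic bytes.

    Illustrative helper: gzip streams begin with 0x1f 0x8b,
    bzip2 streams begin with the ASCII bytes "BZh".
    """
    with open(path, "rb") as f:
        head = f.read(3)
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    if head == b"BZh":
        return "bzip2"
    return "none"
```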

  • 4
    Specify the encoding of the file
    If the file you are importing has a byte order mark (BOM) at the beginning, you can skip this step. Otherwise, you must specify the correct encoding of the file.

    We currently support ASCII, ISO/IEC 8859, UTF-8, UTF-16 (Little endian), UTF-16 (Big endian).

    If you do not know the character encoding, select UTF-8, since it is by far the most commonly used encoding, and preview the data. It is unlikely that your data file is encoded in UTF-16; that option is provided only for special use cases.
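A BOM is a short byte sequence at the very start of the file that identifies the encoding, which is why a file that has one lets you skip this step. A sketch of how such a check works (our own illustration, not NodeChef's implementation):

```python
import codecs

# BOM byte sequences checked longest-first; UTF-8's BOM (ef bb bf) does not
# overlap with the two-byte UTF-16 marks.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_encoding(path: str, fallback: str = "utf-8") -> str:
    """Return an encoding guessed from the file's BOM, or the fallback."""
    with open(path, "rb") as f:
        head = f.read(4)
    for bom, enc in BOMS:
        if head.startswith(bom):
            return enc
    return fallback
```

Falling back to UTF-8 mirrors the advice above: it is the most common encoding, and a quick preview will reveal whether the guess was wrong.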

  • 5
    Select the file format

    If the file you are importing is a CSV file, select the CSV option. Under the options, if the first row in the CSV file contains the column names, you must select the "Consider first row as column headers" option.

    In some cases the file you are importing is delimited not by a comma but by a tab or another character. For this use case, select Flatfile instead. You can then enter the column and row delimiters under options.
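The difference between the two formats is only the delimiter. A tab-delimited flatfile parses exactly like a CSV once the delimiter is supplied, as this sketch with Python's `csv` module shows (the helper and its defaults are ours, for illustration):

```python
import csv

def read_flatfile(path: str, col_delim: str = "\t", header: bool = True):
    """Parse a delimiter-separated flatfile.

    With header=True, rows become dicts keyed by the first row's column
    names, mirroring the "Consider first row as column headers" option.
    """
    with open(path, newline="", encoding="utf-8") as f:
        if header:
            return list(csv.DictReader(f, delimiter=col_delim))
        return list(csv.reader(f, delimiter=col_delim))
```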

  • 6
    Provide the name of the collection in which you want to insert this data.

    If the collection does not exist in the target database, it will be created. Collection names are case sensitive. The "Update object in database if _id exists" option results in the slowest import, because bulk update operations cannot be performed in this case; each object has to be updated individually.

    An _id field will be automatically generated for each row if the CSV has no column named _id.
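Conceptually, the importer reads each row and attaches a unique _id before insertion. The sketch below illustrates that idea; it uses `uuid4` as a stand-in for the MongoDB ObjectIds that are actually generated, and the function itself is hypothetical:

```python
import csv
import uuid

def rows_with_ids(path: str):
    """Yield CSV rows as dicts, adding an _id to rows that lack one.

    Illustration only: uuid4 stands in for MongoDB's ObjectId here,
    and a CSV column literally named _id is passed through untouched.
    """
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if "_id" not in row:
                row["_id"] = str(uuid.uuid4())
            yield row
```

This also shows why updating by _id is slow: each existing _id must be matched and updated one document at a time, whereas fresh rows can be inserted in bulk.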

  • 7
    Optional Column mapping and data types.

    If you skip this section, all the columns in the CSV will be imported as strings. In many cases this is undesirable. NodeChef lets you specify data types, perform column mappings, and transform the data however you want.

    Visit the documentation on CSV transformations to learn more.
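To see why the default matters, consider what a per-column type mapping does. In this sketch, the column names and converters are hypothetical examples of mappings you would configure in the designer:

```python
import csv

# Hypothetical column -> converter mapping; in NodeChef you configure the
# equivalent per-column data types in the designer UI.
CONVERTERS = {
    "age": int,
    "price": float,
    "active": lambda s: s.strip().lower() in ("true", "1", "yes"),
}

def typed_rows(path: str):
    """Yield CSV rows with mapped columns converted from strings."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            for col, conv in CONVERTERS.items():
                if col in row:
                    row[col] = conv(row[col])
            yield row
```

Without such a mapping, `"36"` would be stored as the string "36" rather than the number 36, which breaks numeric queries and sorting in MongoDB.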

  • 8
    Preview and Import

    Use the preview button to preview the data to be inserted; up to 32 rows will be shown. Once satisfied with the preview, use the import button to import the data.
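The preview is simply the first slice of the parsed file, which is why it is fast even for large imports. A minimal sketch of that behavior (the function is ours, not a NodeChef API):

```python
import csv
from itertools import islice

def preview(path: str, limit: int = 32):
    """Return up to `limit` parsed rows, mirroring the designer's preview cap."""
    with open(path, newline="", encoding="utf-8") as f:
        # islice stops reading after `limit` rows, so the rest of the
        # file is never parsed.
        return list(islice(csv.DictReader(f), limit))
```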