Import JSON data into your MongoDB database

  • 1
    Launch the ETL Designer

    Log in to NodeChef. From the NodeChef Task Manager, click DB actions > CSV / JSON import to launch the designer.

  • 2
    Choose the file to import
    • If you are uploading the data, click the Local file system option and choose the file from your file system. The file cannot exceed 192 megabytes in size; however, you can compress it using gzip or bzip before uploading, which allows you to work around this limit.
    • You can also specify the HTTP(S) or FTP(S) URI of the file. The ETL engine will automatically download the file and import the data. The Content-Length of the response from the server hosting the file cannot exceed 192 megabytes. You can compress the file on the server using gzip or bzip.
      If the file is hosted on S3 or GCS, you must enable direct access to the file, or allow anonymous users to access it, before providing the link.
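
As a rough illustration of the compression step, the sketch below gzips a file locally and checks the result against the upload limit. The function names and the constant are hypothetical, the limit is assumed to mean 192 * 1024 * 1024 bytes, and it is assumed to apply to the compressed file; none of this is part of the NodeChef tooling.

```python
import gzip
import os
import shutil

# Assumption: "192 megabytes" is read here as 192 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 192 * 1024 * 1024


def gzip_file(src_path: str, dst_path: str) -> int:
    """Compress src_path into dst_path with gzip; return the compressed size in bytes."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Stream the data in chunks so large files are not loaded into memory.
        shutil.copyfileobj(src, dst)
    return os.path.getsize(dst_path)


def fits_upload_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file at path is no larger than the given limit."""
    return os.path.getsize(path) <= limit
```

For example, calling gzip_file("data.json", "data.json.gz") and then fits_upload_limit("data.json.gz") indicates whether the compressed file is small enough to upload. Remember to select gzip as the compression method in the next step.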
  • 3
    Select input file compression method

    If you compressed the file you are importing, select the compression method used. gzip and bzip are currently supported.

  • 4
    Specify the encoding of the file
    If the file you are importing has a byte order mark (BOM) at the beginning, you can skip this step. Otherwise, you must specify the correct encoding of the file.

    We currently support ASCII, ISO/IEC 8859, UTF-8, UTF-16 (Little endian), UTF-16 (Big endian).

    If you do not know the character encoding, select UTF-8, since it is the most commonly used, and preview the data. It is unlikely that the character encoding of your data file is UTF-16; this option is provided only for special use cases.
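
To check whether a file starts with a BOM before deciding on this step, something like the following sketch can help. It only looks for the UTF-8 and UTF-16 marks listed above, and the function name detect_bom is hypothetical, not part of the ETL designer.

```python
import codecs

# BOM byte sequences for the Unicode encodings listed above.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 (Little endian)"),
    (codecs.BOM_UTF16_BE, "UTF-16 (Big endian)"),
]


def detect_bom(head: bytes):
    """Return the encoding name implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None
```

Passing the first few bytes of the file, for example open(path, "rb").read(4), is enough. A result of None means no BOM was found, so the encoding has to be chosen manually.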

  • 5
    Select the file format

    Select the JSON file format. The input file must be structured using one of the formats below. If the JSON data was obtained from a MongoDB dump or your NodeChef backups, select the MongoDB extended JSON checkbox under options.

    • An array of objects. That is, the first non-whitespace character in the file is an opening square bracket and the last non-whitespace character is a closing square bracket. The objects within the array are the target documents to be inserted into the database.
      [ { ... }, { ... }, { ... } ]
    • A series of JSON objects. A newline separator between objects is not required, but it is often included by export tools such as mongo dump and most popular MongoDB IDEs.
      { ... } { ... } { ... }
    • An array attribute containing JSON documents. The array attribute is referred to as the data attribute (see the options section). This format is typically returned by web services and some REST-based data stores. In the example below, the data attribute is the values array. All objects in the values array will be imported into the database.
      { ok : 1, values : [ { ... }, { ... } ] }
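
The three layouts above can be told apart with a small amount of code. The sketch below is one possible way to read them in Python and is not how the ETL engine is implemented; the function iter_documents and its data_attribute parameter are hypothetical names. Note that Python's json module requires strict JSON, so keys must be double-quoted.

```python
import json
from typing import Any, Iterator, Optional


def iter_documents(text: str, data_attribute: Optional[str] = None) -> Iterator[Any]:
    """Yield documents from any of the three layouts described above."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            # Layout 1: a top-level array of objects.
            yield from value
        elif data_attribute is not None:
            # Layout 3: the documents live in an array attribute.
            yield from value[data_attribute]
        else:
            # Layout 2: a series of objects, with or without newlines.
            yield value
```

With this sketch, list(iter_documents(text)) handles the first two layouts, and list(iter_documents(text, data_attribute="values")) handles the third.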
  • 6
    Provide the name of the collection in which you want to insert this data.

    If the collection does not exist in the target database, it will be created. Collection names are case sensitive. The "update object in database if _id exist" option results in the slowest operation, because bulk update operations cannot be performed in this case; each object has to be updated individually.

    You can also transform the incoming JSON documents to a different structure if required. Visit the documentation on CSV transformations to learn more.

  • 7
    Preview and Import

    Use the preview button to preview the data to be inserted. Up to 32 rows are shown in the preview. Once you are satisfied with the preview, use the import button to import the data.