DATA SPECIFICATIONS
Nextpoint is a flexible platform, and between the platform and our internal team, it’s rare that we come across data we can’t work with.
That being said, anything that is not considered a “standard” import and/or data migration may incur additional service hours for our Engagement Data Services team to make any modifications necessary for successful ingestion into the Nextpoint platform.
Native Data Ingestion Specifications
Nextpoint can ingest PSTs, MBOX files, and many other loose data files (see Supported File Types). Data importing is not always a straightforward process, given the unique nature and size of each data set. Nextpoint will process your native data, but requires a data consult for any data set over 200 GB.
What Is Not Included In Data Import Services
- Downloading data from third-party FTP sites
- Oversight of the organization of a database’s structure post-import
- Repair of PSTs or corrupt files
- Imaging services for files that do not image in Nextpoint
- Custom deduplication of data (deduplication of data beyond standard Nextpoint deduplication software settings)
- Image, text, and/or native replacements for documents that have already been imported into Nextpoint
- Work-product or coding overlays to apply to documents post-import
- File conversions for file types not supported by Nextpoint
- Extraction of attachments from PDF portfolios
Legacy + Produced Data Specifications
Produced/Migrated data needs to meet all requirements listed below to be considered “standard”:
- The data is already exported from the third-party site and ready for import. If you would like assistance with downloading your data from another source, Nextpoint can help and will provide a quote.
- Each document has a document-level PDF image, or per-page TIF/TIFF files (for black-and-white images) and/or JPG/JPEG files (for color images)
- The images must be uniquely named with a Bates or DocID number
- For per-page images, all image file names must have a consistent number of characters (that is, if one page of a document has a per-page suffix such as “_0001”, all pages must also have a per-page suffix)
- For document-level PDFs with no load file, Nextpoint can assign Bates numbers upon import as a convenience if the documents have an identifiable document-level Bates/DocID scheme. Otherwise, they will be imported as individual loose documents, named by the following hierarchy: 1. Subject/Title, 2. Original File Name, 3. “Untitled”
- If the production/migration set includes a load file:
  - The load file must be a standard DAT, CSV, or TXT file with standard column delimiters
  - If the production/migration is imaged at a per-page level, the load file must include a bates_start/bates_end or image_range_start/image_range_end (or equivalent) for each document
  - If email family information is included within the load file (IDs that allow parent emails to be associated with their attachments), it must be in bates_start/begattach or docid/parentid format (or equivalent)
  - If the production/load file would result in more than 100,000 documents within the production, additional service hours may be required to split the production into manageable batches
  - Any search/OCR text must be provided at the document level (if the text file contains no page breaks, all search text will be treated as though it fell on the first page of the document). If no search/OCR text is provided, the Nextpoint application will OCR the documents upon import.
  - All search/OCR text must be referenced within the load file with a document-level relative path that matches the folder/file structure of the production/migration, including exact file name matching (for example, TEXT/TEXT001/EXMPL00001.txt)
  - Any natives included must also be referenced within the load file with a document-level relative path that matches the folder/file structure of the production/migration, including exact file name matching (for example, NATIVES/NATIVE001/EXMPL00001.xlsx)
- For migrated data, if any documents have redactions and/or highlights and need to retain both their clean and annotated versions within Nextpoint, both versions of the images must be provided (named consistently by Bates/DocID so that the clean and annotated versions are easy to tell apart)
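
Several of the requirements above can be checked before a production is handed over. The following is a minimal Python sketch of such a pre-check, not a Nextpoint tool. It assumes a comma-delimited load file with a header row, and the column names text_path and native_path are hypothetical (bates_start and bates_end come from the list above; a real load file may label its columns differently). The function names are also made up for illustration.

```python
import csv
import io
import re
from pathlib import PurePosixPath

# Assumed column names; adjust to match the actual load file header.
REQUIRED_COLUMNS = ("bates_start", "bates_end")
PATH_COLUMNS = ("text_path", "native_path")  # hypothetical names

# A per-page suffix such as "_0001" at the end of a file stem.
PAGE_SUFFIX = re.compile(r"_\d+$")


def inconsistent_image_names(image_names):
    """Return True if per-page image names mix lengths or page suffixes."""
    stems = [PurePosixPath(name).stem for name in image_names]
    lengths = {len(stem) for stem in stems}
    suffixed = {bool(PAGE_SUFFIX.search(stem)) for stem in stems}
    return len(lengths) > 1 or len(suffixed) > 1


def check_load_file(load_file_text, production_files, max_docs=100_000):
    """Collect problems that could make a load file non-standard.

    production_files is a set of relative paths (forward slashes) for
    every file in the production. Matching is exact and case-sensitive.
    """
    reader = csv.DictReader(io.StringIO(load_file_text))
    problems = []

    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        problems.append(f"missing columns: {', '.join(missing)}")

    rows = list(reader)
    if len(rows) > max_docs:
        problems.append(f"{len(rows)} documents exceeds {max_docs}; may need batching")

    # Row numbers count the header as row 1.
    for row_no, row in enumerate(rows, start=2):
        for col in PATH_COLUMNS:
            path = (row.get(col) or "").strip()
            if path and path not in production_files:
                problems.append(f"row {row_no}: {col} not found: {path}")
    return problems


files = {
    "TEXT/TEXT001/EXMPL00001.txt",
    "NATIVES/NATIVE001/EXMPL00001.xlsx",
}
load_file = (
    "bates_start,bates_end,text_path,native_path\n"
    "EXMPL00001,EXMPL00003,TEXT/TEXT001/EXMPL00001.txt,"
    "NATIVES/NATIVE001/EXMPL00001.xlsx\n"
    "EXMPL00004,EXMPL00004,TEXT/TEXT001/exmpl00004.txt,\n"
)
print(check_load_file(load_file, files))
print(inconsistent_image_names(["EXMPL00001_0001.tif", "EXMPL00001_0002.tif"]))
```

In this sample, the second document's text path differs from the production only by letter case, so it is reported as not found, which mirrors the exact file name matching requirement. A DAT load file with non-comma delimiters would need the reader configured for those delimiters first.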