Building Custom Data Importers: What Engineers Need to Know

❥ Sponsor

Importing data is a common pain point for engineering teams. Whether it's CRM records, inventory SKUs, or customer details, nearly every engineer can relate to the frustration of building a way to get data into an application. Data import is a critical product experience, and a poor one is a huge headache: it delays customers' time to value, strains internal resources, and takes valuable development cycles away from building key, differentiating product features.

Frequent error messages end-users receive when importing data. Why do we expect customers to fix this themselves?

Data importers, specifically CSV importers, haven't been treated as key parts of the product and customer experience. As a result, engineers tend to spend an exorbitant amount of effort building less-than-ideal solutions for customers to successfully import their data.

Engineers typically write lengthy technical documentation for customers to consult when an import fails. But this doesn't truly solve the issue; it shifts the burden of a great experience from the product onto the end user.

In this article, we'll address the current problems with importing data and discuss a few key product features you'll need to consider if you decide to build an in-house solution.

Importing data is frustrating for nearly everyone involved at a data-led company. Simply put, there has never been a standard for importing customer data. Until now, teams have deferred to CSV templates, lengthy documentation, video tutorials, or buggy in-house solutions to let users import spreadsheets. Companies trying to import CSV data can run into a variety of issues, such as:

  • Fragmented data: With no standard way to import data, we get emails going back and forth with attached spreadsheets that are manually imported. As spreadsheets get passed around, there are obvious version control challenges. Who made this change? Why don’t these numbers add up as they did in the original spreadsheet? Why are we emailing spreadsheets containing sensitive data?
  • Improper formatting: CSV import errors frequently occur when data isn't formatted correctly. As a result, companies often rely on internal developer resources to clean and format data on behalf of the client, a process that can take hours per customer and may lead to churn anyway. Common examples include correcting date formats or splitting fields that need to be healed prior to import.
  • Encoding errors: There are plenty of instances where a spreadsheet can't be imported because it's improperly encoded. For example, a company may need a file saved with UTF-8 encoding (the encoding typically preferred for email and web pages) in order for it to upload properly to their platform. Incorrect encoding can trigger a lengthy chain of communication in which the customer is burdened with correcting and re-importing their data (see the sketch after this list).
  • Data normalization: A lack of data normalization results in data redundancy and a never-ending string of data quality problems that make customer onboarding particularly challenging. Examples include normalizing email addresses before they're imported into a database, or checking value uniqueness, either of which can put a heavy load on engineers to get validation working correctly.
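
To make the encoding problem concrete, here's a minimal TypeScript sketch of the kind of up-front check an in-house importer might run before parsing anything; the function name and error message are illustrative, not from any particular library:

```typescript
import { readFileSync } from "node:fs";

// Reject files that aren't valid UTF-8 before parsing begins, instead of
// letting mis-encoded bytes surface as confusing errors downstream.
// TextDecoder with { fatal: true } throws on invalid UTF-8 byte sequences.
function readUtf8OrFail(path: string): string {
  const bytes = readFileSync(path);
  const decoder = new TextDecoder("utf-8", { fatal: true });
  try {
    return decoder.decode(bytes);
  } catch {
    throw new Error(
      `${path} is not valid UTF-8. Re-save the file with UTF-8 encoding and try again.`
    );
  }
}
```

Catching this at upload time lets the product show one clear message, rather than kicking off the back-and-forth email chain described above.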

Remember building your first CSV importer?

When it comes down to creating a custom-built data importer, there are a few critical features you should include to improve the user experience. (One caveat: a data importer is time-consuming not only to create but also to maintain. It's easy to confirm your company has adequate engineering bandwidth when first building a solution, but what about maintenance in 3, 6, or 12 months?)

A preview of Flatfile Portal. It integrates in minutes using a few lines of JavaScript.

Data mapping

Mapping, or column-matching (the terms are often used interchangeably), is an essential requirement for a data importer; without it, the file import will typically fail. Suppose you've configured your data model to accept contact-level data, and one of the required fields is “address.” If the customer imports a spreadsheet where that column is labeled “mailing address,” the import will fail because “mailing address” doesn't match “address” in the back-end system. This is typically ‘solved’ by providing a pre-built CSV template, which customers then have to manipulate their data to fit, effectively increasing time-to-value during the product experience. Data mapping needs to be included in a custom-built importer as a key feature to retain data quality and improve the customer data onboarding experience.
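
As a rough illustration (not Flatfile's implementation), even a simple alias table can resolve common header variants to the fields your data model expects; the field names and aliases below are hypothetical:

```typescript
// Map incoming spreadsheet headers to canonical data-model fields.
// Field names and aliases are hypothetical examples.
const FIELD_ALIASES: Record<string, string[]> = {
  address: ["address", "mailing address", "street address"],
  name: ["name", "full name", "contact name"],
  email: ["email", "e-mail", "email address"],
};

function matchColumn(header: string): string | null {
  const normalized = header.trim().toLowerCase();
  for (const [field, aliases] of Object.entries(FIELD_ALIASES)) {
    if (aliases.includes(normalized)) return field;
  }
  return null; // Unmatched: fall back to asking the user to map it manually.
}

console.log(matchColumn("Mailing Address")); // "address", instead of a failed import
```

A production importer would layer fuzzy matching and user confirmation on top of this, but even an alias table removes the most common import failures.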

Auto-column matching CSV data is the bread and butter of Portal, saving massive amounts of time for customers while providing a delightful import experience.

Data validation

Data validation, which checks whether the data matches an expected format or value, is another critical feature to include in a custom data importer. Data validation is all about ensuring the data is accurate and conforms to your required data model. For example, if special characters aren't allowed in a certain field, error messages should appear during the import stage. A spreadsheet with hundreds of rows containing validation errors is a nightmare: either customers have to fix the issues themselves, or your team spends hours on end cleaning data. Automatic data validators streamline the healing of incoming data without the need for manual review.
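
As a sketch of what declarative, per-field validation might look like (the fields and rules here are invented for illustration):

```typescript
// Each validator returns null for valid input, or a message describing the problem.
type Validator = (value: string) => string | null;

const validators: Record<string, Validator> = {
  sku: (v) =>
    /^[A-Z0-9-]+$/.test(v) ? null : "may only contain A-Z, 0-9, and dashes",
  quantity: (v) => (/^\d+$/.test(v) ? null : "must be a whole number"),
};

// Collect every validation error in a row so the user sees them all at once.
function validateRow(row: Record<string, string>): string[] {
  const errors: string[] = [];
  for (const [field, validate] of Object.entries(validators)) {
    const message = validate(row[field] ?? "");
    if (message) errors.push(`${field} ${message}`);
  }
  return errors;
}

console.log(validateRow({ sku: "ab!", quantity: "12" }));
// -> ["sku may only contain A-Z, 0-9, and dashes"]
```

Keeping the rules declarative makes it easy to surface every problem in one pass, instead of failing on the first bad cell.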

We built Data Hooks into Portal to quickly normalize data on import. A perfect use-case would be validating email uniqueness against a database.

Data parsing

Data parsing is the process of taking an aggregation of information (such as a spreadsheet) and breaking it into discrete parts. In a custom-built data importer, a data parsing feature should not only be able to turn a file into an array of discrete data, but also streamline that process for customers.
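
This is harder than it looks: a CSV field can legally contain commas and escaped quotes, so naive splitting on commas breaks. Here's a minimal sketch of parsing one CSV line into discrete values (a real importer would also handle newlines inside quoted fields, other delimiters, and malformed input):

```typescript
// Parse a single CSV line into an array of field values,
// honoring quoted fields and "" as an escaped quote.
function parseCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // Escaped quote inside a quoted field.
        i++;
      } else if (ch === '"') {
        inQuotes = false; // Closing quote.
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // Opening quote.
    } else if (ch === ",") {
      fields.push(current); // Field boundary.
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

console.log(parseCsvLine('Ada,"123 Main St, Apt 4","She said ""hi"""'));
// -> [ 'Ada', '123 Main St, Apt 4', 'She said "hi"' ]
```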

Data transformation

Data transformation means making changes to imported data as it flows into your system so that it meets an expected or desired value. Rather than sending data back to users with an error message for them to fix, data transformation can make small, systematic tweaks that leave users' data more usable in your backend. For example, when importing a task list, prioritization data could be transformed into a different representation, such as numbers instead of labels.
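
For instance, a tiny lookup table like the following (the mapping is hypothetical) quietly converts priority labels into the numeric values a backend might expect:

```typescript
// Convert human-entered priority labels into backend-friendly numbers.
// The mapping is a made-up example.
const PRIORITY_MAP: Record<string, number> = {
  high: 1,
  medium: 2,
  low: 3,
};

function transformPriority(value: string): number | null {
  // Tolerate whitespace and casing differences instead of rejecting the row.
  return PRIORITY_MAP[value.trim().toLowerCase()] ?? null; // null = flag for review
}

console.log(transformPriority("  High ")); // 1
```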

Data Hooks normalize imported customer data automatically using validation rules set in the Portal JSON config. These highly adaptable hooks can be put to work auto-validating nearly any incoming customer data.

We’ve baked all of the above features into Portal, our flagship CSV importer at Flatfile. Now that we’ve reviewed some of the must-have features of a data importer, the next obvious question for an engineer building an in-house importer is typically… should they?

Engineering teams that take on this task typically use custom or open-source solutions, which may not fit their specific use-cases. Building a comprehensive data importer also brings UX challenges: designing a UI and maintaining the application code that handles data parsing, normalization, and mapping. And that's before considering how customer data requirements may change in the months ahead, and the ramifications of maintaining a custom-built solution.

Companies facing data import challenges are now considering integrating a pre-built data importer such as Flatfile Portal. We've built Portal to be the elegant import button for web apps. With just a few lines of JavaScript, Portal can be implemented alongside any data model and validation ruleset, typically in a few hours. Engineers no longer need to dedicate hours to cleaning up and formatting data, nor do they need to custom-build a data importer (unless they want to!). With Flatfile, engineers can focus on creating product-differentiating features rather than on solving spreadsheet imports.

Importing data is fraught with challenges, and there are several critical features to include if you build your own data importer. The alternative to a custom-built solution is a pre-built data importer such as Portal.

Flatfile’s mission is to remove barriers between humans and data. With AI-assisted data onboarding, they eliminate repetitive work and make B2B data transactions fast, intuitive, and error-free. Flatfile automatically learns how imported data should be structured and cleaned, enabling customers and teams to spend more time using their data instead of fixing it. Flatfile has transformed over 300 million rows of data for companies like ClickUp, Blackbaud, Benevity, and Toast. To learn more about Flatfile’s products, Portal and Concierge, visit flatfile.io.
