Quick Tip: Seeding Large Data in Entity Framework Core 2.1
Populating small tables isn't a problem with Entity Framework Core, but what about large amounts of data? Today, I show how to feed large data into your entities.
Seeding has always been a problem when it comes to Entity Framework.
Where do you place your seeding code? In your Startup.cs? In a helper class?
Well, the Entity Framework team has done it again. The Entity<T> class has a method called HasData()
. This HasData method allows you to pass in either a single entity of T or a list of T entities to automatically populate your table.
Julie Lerman also has an absolutely awesome deep dive on HasData Seeding for Entity Framework. Definitely a must read!
For small lookup tables, this is great, but what if you have a LOT of data required for the application to function properly?
Use the [Re]source, Luke!
I remember using Resource files a long time ago. How hard it was to create one, build one, and maintain it. It was excruciatingly painful.
Maybe it was my lack of using them or the tools became better, but I didn't like them at all.
Now they're cool again.
They're now easier to work with and provide a huge benefit for storing default data in the web app.
Perfect for Internationalization and now, seed data.
Let's create one.
- Go to your ASP.NET Core (or WebForms or MVC app) project.
- Add a new Resource.resx file and call it whatever you want (for example, MyAppResources.resx).
- The first column (Name) is the name you'll refer to in your web app when initially grabbing data (for example, call it StateXml)
- In the second column, this is where you place the contents of your seed data (Value column)
- Finally, at the top of the grid near the middle, there is an Access Modifier. Change it to Public. This will allow you to access the resource.
With SQL Server 2014 and higher, you can go to SQL Server Management Studio (SSMS) and type either:
SELECT * FROM States FOR XML AUTO
or
SELECT * FROM States FOR JSON AUTO
This will give you an XML or JSON version of your data for your Resource file.
Once you have your resource file saved, we can add the following three lines in our DbContext (or wherever it makes sense for your application). The best place to put this is in your OnModelCreating in your DbContext for your Entities because of the ModelBuilder reference.
(The example below is based on the listed scenario from above).
// State Items var jsonStateList = MyAppResources.StateXml; var states = JsonConvert.DeserializeObject<List<State>>(jsonStateList); modelBuilder.Entity<State>().HasData(states);
I know I'm using Json for my example, but you can easily use XML instead.
If you want to learn more about serializing/deserializing objects using XML, check out my Serialization/Deserialization Series.
Ahem...What about a script?
I know what you're thinking...why not place a SQL Script inside the Resource file instead?
Two reasons:
- A script has too much cruft. While I admit, it would be easier with a script generated from SQL Server, but it would be bloated with a lot of 'INSERT INTO' statements. All we want in our Resource file is pure, serialized data.
- We want to keep our entity tables granular. If a couple of tables were not populated, I don't want to run an entire script for a couple of empty tables.
Also, we could just run a SQL Script, but lately, Entity Framework is becoming more code-centric with it's code-first approach where even the database changes are reflected in code per EF Migrations.
*Bonus Benefit*
Did you know if you have this data in a Resource file, you have a ready-to-go In-Memory database for your unit tests.
Start up your unit tests and prepare the DbContext to include all of your tables and you can run everything in-memory.
It's absolutely one of the fastest ways to mock a database.
Conclusion
This quick tip shows you how to take a large amount of data and prepopulate your tables at runtime. SQL Scripts are helpful, but versioning the code (or data in this case) is more important.
How do you seed large amounts of data in your application? SQL Script? Code? cUrl? Post your comments below and let's discuss this approach!