How to create your own dataset in Excel? 🧮
Have you just started your journey with Power BI?
Are you looking for a dataset that you can use in or connect Power BI to without the fear of slowing down the main database?
Or just looking for ways of improving your skills?
If you answered yes to any of the above questions, I think today’s blog post and video will prove extremely helpful 💡
Still, to this day there are certain reports where instead of using LIVE data I prefer to start with Dummy or Test data. Reasons can vary, but in general, it is good to have full control over the dataset. This way I can create outliers, validate my measures, adjust the report template, and heaps more.
But in a LIVE environment, we do not want to do all of that – and hopefully, we can’t.
How can we solve this?
So, from time to time we will need to have a dataset that can be shapes based on the requirements. At those times we can easily create an Excel-based STAR schema and import those files or sheets to Power BI.
This option gives us greater flexibility and a perfect environment to fool around in Power BI.
I usually start with my Dimension tables (point ends of the star) and come up with random details. In the below video I created Product, Customer and Warehouse tables, and all of them are based on what I needed at the time.
*Of course, given that this is a DEMO video, I wanted to show your that only your imagination could stop you from creating even bigger Dimension tables.
In general, these Dimension tables are wider than longer, meaning they have more columns than rows, and only contain a limited number of rows. Therefore, whenever I use this Dummy Data creation, I only create a handful of records. It’s also good to limit the number of records, so we don’t have to sit in front of the computer trying to come up with product or customer names 🥱
Once all Dimension tables are finalized, we can turn to our FACT table – in this example a Sales Report.
This is where all the magic happens – or shall I say all the RANDOM stuff happens 🧙♂️
Why should we use Excel?
In Excel, we can utilize the RANDBETWEEN function (and in some cases RAND as well) to create randomized values based on a range. This is great, as we have just created our IDs in our Dimension tables, so we can feed the min and max values into RANDBETWEEN. It will ensure that our FACT table only contains values that have corresponding details in our Dimension tables. So creating relationships to link them together is a walk in a park.
Once the above steps are done, we can head over to Power BI, import these tables, and start practicing. Whether it is about creating a proof of concept for our report, practicing data visualization or DAX, or just general playing around in Power BI we have a dataset that is 100% flexible and ready to use.
I consider this another great way of connecting the two tools (Excel and Power BI) and one way of utilizing the power of the suit of BI tools from Microsoft.
What do you think? Let me know down in the comment section here or under the video if you used this method before.
🙏Thanks for watching and if you liked this video please hit the 👍 button.
📢 Also be sure to subscribe to the channel so you never miss a new video.