Here are some key points about data in a Lakehouse in Microsoft Fabric:
Data Ingestion: There are multiple ways to get data into a Lakehouse in Microsoft Fabric: connecting to an existing SQL Server and copying data into a delta table in the Lakehouse, uploading files from your computer, copying and merging multiple tables from other Lakehouses into a new delta table, connecting to a streaming source to land data in the Lakehouse, and referencing data from other internal Lakehouses or external sources without copying it. In practice this means uploading a file from a local computer, running the Copy tool in a pipeline, setting up a dataflow, or using Apache Spark libraries in notebook code. The right approach depends on the use case: for small files from a local machine, use local file upload; for large data sources, use the Copy tool in pipelines; and for complex data transformations, use notebook code.
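The use-case guidance above can be sketched as a small decision helper. This is purely illustrative: the function name, thresholds, and option labels are assumptions for the sketch, not part of any Fabric API.

```python
# Hypothetical helper summarizing the ingestion guidance above.
# Thresholds and labels are illustrative assumptions, not Fabric features.
def recommend_ingestion(size_mb: float, complex_transforms: bool, streaming: bool) -> str:
    if streaming:
        return "Streaming source connection"      # land events as they arrive
    if complex_transforms:
        return "Notebook code (Apache Spark)"     # code-first transformations
    if size_mb < 100:
        return "Local file upload"                # quick, small, one-off files
    return "Copy tool in pipelines"               # scalable bulk copy
```

A call such as `recommend_ingestion(5000, False, False)` returns "Copy tool in pipelines", matching the large-data-source case in the text.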
Lakehouse Architecture: A Lakehouse in Fabric can be implemented as a lakehouse architecture, a data warehouse architecture, or a combination of the two. The architecture follows a medallion model: the bronze layer holds raw data, the silver layer holds validated and deduplicated data, and the gold layer holds highly refined data. A typical implementation involves creating a Fabric workspace, creating a Lakehouse, ingesting data, transforming it, and loading it into the Lakehouse. Data in the bronze, silver, and gold zones is stored as Delta Lake tables. You can connect to your Lakehouse through its TDS/SQL endpoint and create a Power BI report to analyze the data, and you can orchestrate and schedule the ingestion and transformation flow with a pipeline.
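The bronze-to-silver step of the medallion model (validate, then deduplicate) can be sketched in plain Python. In Fabric this would typically run as Spark code in a notebook against Delta tables; plain dicts stand in here so the sketch is self-contained, and the record shape is an assumption.

```python
# Illustrative bronze -> silver step of the medallion model: drop invalid
# records, then deduplicate on a business key. Plain Python stands in for
# Spark/Delta; the "id"/"amount" schema is an assumption for the sketch.
bronze = [
    {"id": 1, "amount": 10.0},
    {"id": 1, "amount": 10.0},   # duplicate of the first record
    {"id": 2, "amount": None},   # invalid: missing amount
    {"id": 3, "amount": 7.5},
]

def to_silver(rows):
    seen, silver = set(), []
    for row in rows:
        if row["amount"] is None:
            continue             # validation: drop incomplete records
        if row["id"] in seen:
            continue             # deduplication by business key
        seen.add(row["id"])
        silver.append(row)
    return silver
```

The gold layer would apply further refinement (aggregation, business-level modeling) on top of this validated silver output.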
Data Flow and Transformation: In a Lakehouse in Microsoft Fabric, data is ingested from various data sources using connectors integrated into Fabric pipelines. The data is then transformed and stored in the standardized Delta Lake format, which lets all the Fabric engines access and manipulate the same dataset in OneLake without duplicating data. Transformations can be built with pipelines and dataflows, or with notebooks and Spark for a code-first experience. After transformation, the data can be consumed in Power BI for reporting and visualization. Each Lakehouse also has a built-in TDS/SQL endpoint, so other reporting tools can connect to and query the Lakehouse tables.
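The TDS/SQL endpoint accepts standard SQL queries over the Lakehouse tables. As a runnable stand-in, the sketch below issues the same kind of aggregate query against an in-memory sqlite3 database (Python stdlib); the `sales` table and its columns are illustrative assumptions, and a real client would instead open a TDS connection to the endpoint's connection string.

```python
import sqlite3

# sqlite3 stands in for the Lakehouse TDS/SQL endpoint so the query runs
# locally; the table name and columns are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 100.0), ("EMEA", 50.0), ("APAC", 70.0)],
)

# The kind of aggregate query a reporting tool would send to the endpoint
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()
# rows -> [('APAC', 70.0), ('EMEA', 150.0)]
```

Because the endpoint speaks TDS, any tool that can query SQL Server (including Power BI) can run queries like this directly against the Lakehouse tables.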
Please note that I could not find a specific step-by-step guide for loading data into a Lakehouse due to time constraints, but I can continue the search if you would like more information.