SQL 2016 In-Memory OLTP 可以提高交易效能超過15~30倍以上,
實務上我們可能會有大型檔案資料匯入SQL Server需求,假設效能頻頸發生在目的資料表,
這篇我將透過SSIS並使用BDD,模擬比較傳統資料表和In-Memory資料表在ETL效能上的差異。
Balanced Data Distributor(BDD)主要利用多CPU平行能力提高轉換效能,他會將在緩衝區中的資料,
平均分布在每個輸出的執行緒(RR方式),但如果輸入資料量太小,反而效能會更差,另外也不保證資料排序。
truncate table rsa241
CREATE TABLE [dbo].[rsa241](
[num] [int] NOT NULL,
[stime] [nchar](8) NOT NULL,
[event_code] [nvarchar](4) NULL,
[event_des] [nvarchar](50) NULL,
[event_card] [nvarchar](10) NULL,
[door_name] [nvarchar](20) NULL,
[device_id] [nvarchar](20) NULL,
[site_id] [nvarchar](12) NULL,
[ip] [nvarchar](16) NULL,
[device_name] [nvarchar](20) NULL,
[ncuip] [nvarchar](16) NULL,
[emp_id] [nvarchar](20) NULL,
[sdate] [nvarchar](10) NULL,
[emp_dep] [nvarchar](12) NULL,
CONSTRAINT [PK_rsa241] PRIMARY KEY CLUSTERED
(
[num] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
select * from [rsa241]
drop table rsa241inmemory
CREATE TABLE [dbo].[rsa241inmemory](
[num] [int] NOT NULL,
[stime] [nchar](8) NOT NULL,
[event_code] [nvarchar](4) NULL,
[event_des] [nvarchar](50) NULL,
[event_card] [nvarchar](10) NULL,
[door_name] [nvarchar](20) NULL,
[device_id] [nvarchar](20) NULL,
[site_id] [nvarchar](12) NULL,
[ip] [nvarchar](16) NULL,
[device_name] [nvarchar](20) NULL,
[ncuip] [nvarchar](16) NULL,
[emp_id] [nvarchar](20) NULL,
[sdate] [nvarchar](10) NULL,
[emp_dep] [nvarchar](12) NULL,
CONSTRAINT [PK_rsa241inmemory] PRIMARY KEY nonCLUSTERED hash ([num]) with (bucket_count=40000))
WITH (memory_optimized=on,durability=schema_and_data)
GO
select * from [rsa241inmemory]
Control Flow
Note: 依照硬體資源修改DefaultBufferSize和 DefaultBufferMaxRows 可提高效能。
Data Flow
測試結果
傳統資料表with BDD處理時間約18秒
In-Memory資料表with BDD處理時間約8秒
參考
Balanced Data Distributor Transformation
The “Balanced Data Distributor” for SSIS
Parallelize Data Flows with SSIS Balanced Data Distributor Transformation
An Introduction to SSIS Balanced Data Distributor Transformation Component