Posts

Showing posts from 2016

Revolution R Enterprise inside SQL Server - Forecasting using Linear Regression

Find installed R Packages in SQL/R Process sp_execute_external_script @language = N'R', @script = N'OutputDataSet <- data.frame(installed.packages())' Install Additional R Packages https://msdn.microsoft.com/en-in/library/mt591989.aspx  Revolution R Enterprise inside SQL Server EXECUTE   sp_execute_external_script @language = N'R' ,@script = N' salesData <- InputDataSet fit <- lm(SalesAmount ~ Year + Quarter, data = as.data.frame(salesData)) data2017 <- data.frame(Year=2017, Quarter=1:4) sales2017 <- predict(fit, newdata=data2017) style <- c(rep(1,14), rep(2,4)) salesForecast <- data.frame(data2017, sales2017) library(sqldf) OutputDataSet <- sqldf("select Quarter, Year, SalesAmount from salesData UNION select Quarter, Year, sales2017 from salesForecast") ' ,@parallel = 1 ,@input_data_1 = N'select d.CalendarQuarter Quarter, CalendarYear Year, sum(f.SalesAmount) SalesAm...

Business Intelligence Basics

BI Concept:  Star Schema: The star schema has a center, represented by a fact table, and the points of the star, represented by the dimension tables. Snowflake Schema: In a snowflake, the dimension tables are normalized. ODS: ODS should be considered only when information retrieval requirements are to be real time and cannot access OLTP systems because of criticality of resources. Dimension Types: Changing Dimension:  Changing dimension is the dimension which has at least one attribute whose value would change over the time. Slowly Changing Dimensions: Attributes of a dimension that would undergo changes over time. For example: Employee work location  • Type I: The existing record is updated with change. • Type II: History is preserved, a new record is added to dimension and old record is marked as inactive.  EndDate or IsLatest is generally used to represent latest record of a business key. • Type III: Type III preserve...

SSAS Dimension Usages - Relationship Types

Dimensions and Measure Groups can be related in different ways Regular - A Direct relationship.  Key column for the dimension is joined directly to the fact table Fact - A Column in the fact table can be treated as a  degenerate dimensions  (usually used for drillthrough).  Useful dimensional data is sometimes stored in a fact table to reduce duplication. Referenced - A Dimension joined through another dimension is treated as if directly joined to the fact table.  Key column for the dimension is joined indirectly to the fact table through a key in another dimension table Many-to-Many - Handles many-to-many fact relationships. I t is frequently useful to join a single fact to multiple dimension members.

Enable Usage-Based Optimization in SSAS

Usage-Based Optimization is a fantastic feature in SSAS to improve the most frequently used MDX queries of any cube database. This feature is used to design aggregations based on actual user queries using Business Intelligence Development Studio. But when ever you try to design  Usage-Based Optimization  you might have seen all the options  disabled.  This uses the technique of creating aggregations on FREQUENTLY USED queries in the given time interval.  We can enable the options in Usage-Based Optimization wizard. Connect to SSAS using SSMS (Sql Server Management Studio). Right click on the server and select properties as shown below. set CreateQueryLogTable to  true. Set the value of  QueryLogConnectionString  to any database where you wish to store your queries. Set the value of  QueryLogSampling  to the appropriate value for your environment (the default value of 10 means every 10th query will be logged). The lowe...

SSAS Storage Mode

MOLAP - The partition storage mode is standard MOLAP. Proactive caching is disabled. You need to process the partition to refresh data. Scheduled MOLAP - Same as MOLAP except that the server will process the partition on a daily schedule. Automatic MOLAP - The default silence interval is set to 10 seconds. As a result, the server will not react if the data change batches are fewer than 10 seconds apart. If there is not a period of silence, the server will start processing the cache in 10 minutes. Medium-latency MOLAP - Similar to Automatic MOLAP except that the MOLAP cache expires in 4 hours. Low-latency MOLAP - The MOLAP cache expires in 30 minutes.   Real-time HOLAP - As with standard HOLAP, partition data is stored in the relational database, and aggregations are stored in the cube. Aggregations are rebuilt as soon as a data change notification is received. Real-time ROLAP - As with standard ROLAP, partition data and aggressions are stored in the ...

SQL Server Compress and Decompress Functions

COMPRESS and DECOMPRESS functions can be used as shown below: SELECT COMPRESS ('Compress and Decompress Function SQL Server') select CAST(  DECOMPRESS( COMPRESS ('Compress and Decompress Function SQL Server') )  AS NVARCHAR(MAX) ) FROM dbo.Product

Common Table Expressions (CTE) for retrieving Last Backup information

--Get List of Last Full Backup or No Backup ;WITH LAST_FULL_BACKUP_LIST AS ( SELECT SYSDBLIST.name AS Name, MAX(BUSETS.backup_finish_date) AS Last_Backup_Finish_DateTime FROM MASTER.sys.databases AS SYSDBLIST LEFT OUTER JOIN msdb.dbo.backupset AS BUSETS ON SYSDBLIST.name = BUSETS.database_name WHERE SYSDBLIST.name<>'TempDB' AND BUSETS.[type] ='D' OR BUSETS.[type] IS NULL GROUP BY SYSDBLIST.name ) , --Get List of Last Differential Backup LAST_DIFFERENTIAL_BACKUP_LIST AS ( SELECT SYSDBLIST.name AS Name, MAX(BUSETS.backup_finish_date) AS Last_Backup_Finish_DateTime FROM MASTER.sys.databases AS SYSDBLIST LEFT OUTER JOIN msdb.dbo.backupset AS BUSETS ON SYSDBLIST.name = BUSETS.database_name WHERE SYSDBLIST.name<>'TempDB' AND BUSETS.[type] ='I' GROUP BY SYSDBLIST.name ) , --Get List of Last Log Backup LOG_BACKUP_LIST AS ( SELECT SYSDBLIST.name AS Name, BUSETS.backup_finish_date AS Backup_F...
Image
Columnstore Indexes ·          In-memory, compressed data in pages based on columns instead of rows Types of Columnstore Index •        Clustered Columnstore Indexes –       SQL Server 2014 Enterprise, Developer, and Evaluation Edition Only –       Includes all columns in the table –       Only index on the table –       Optimizes storage for compression and performance –       Updatable CREATE CLUSTERED COLUMNSTORE INDEX [IX_CS_FactProductInventory] ON dbo. FactProductInventory; •        Non-Clustered Columnstore Indexes –       Includes some or all columns in the table –       Can be combined with other indexes –       Read-Only CREATE NONCLUSTERED...

Buffer Pool Extension

Buffer Pool Extension •        Extends buffer cache to non-volatile storage •        Improves performance for read-heavy OLTP workloads •        SSD devices are often more cost-effective than adding physical memory •        Simple configuration with no changes to existing applications. ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION ON     (FILENAME = 'F:\SSDCACHE\Example.BPE', SIZE = 60 GB); GO Returning configuration buffer pool extension information SELECT path, file_id, state, state_description, current_size_in_kb FROM sys.dm_os_buffer_pool_extension_configuration; Returning the number of cached pages in the buffer pool extension file SELECT COUNT(*) AS cached_pages_count FROM sys.dm_os_buffer_descriptors WHERE is_in_bpool_extension <> 0; Setting process affinity The examples in this section show ...

Creating Memory-Optimized Tables & Indexes

Image
Memory-Optimized Tables •        Defined as C structs, compiled into DLLs, and loaded into memory •        Can be persisted as filestreams, or non-durable •        Do not apply any locking semantics •        Can be indexed using hash indexes •        Can co-exist with disk-based tables •        Can be queried using Transact-SQL •        Cannot include some data types, including text, image, and nvarchar(max) •        Do not support identity columns or foreign key constraints Memory-Optimized Table Scenarios •        Optimistic concurrency optimizes latch-bound workloads: –       Multiple concurrent transactions modify large numbers of rows –   ...