Utilizing SiSense and Presto For Distributed Data Analysis

Data Analysis Using SiSense

SiSense is a full-fledged data analysis platform that gives its users actionable analysis for data-driven management. As an end-to-end platform, it makes it simple to draw on disparate data sources, analyze the data, and present the results to users in an easy-to-use form. Supported data sources include data files, MySQL, Oracle, Salesforce, and BigQuery. By integrating different data stores, the platform can also overcome the problem of data silos and provide a holistic view of the available data. Organizations using SiSense include Nasdaq, GitLab, Philips, and Tinder.

Understanding How SiSense Works

SiSense is a web-based data analytics platform in which servers handle data ingestion, processing, and user interaction. The SiSense Server can be installed on a single machine or deployed in a cluster. The ElastiCube Data Hub connects to the data sources in one of two modes: self-managed Live Models or proprietary, super-fast ElastiCube Models. The ElastiCube Server and the Application Server handle data management and processing. The Web Server lets business users interact with SiSense through a web app, a mobile app, and a REST API. Client apps handle essential tasks such as data source management, server management, distributed messaging, and node orchestration.

SiSense can be used for BI in different ways, including manual installation, cloud deployment, or both. Data engineers handle data connections and management to create data models that are ready for analysis. Data developers use the models to carry out ad-hoc analysis, develop custom BI solutions, and create UI artifacts to present to business users.

What is Presto?

Presto was created by Facebook to handle the huge amounts of data it generates every minute. Like Hadoop, it can carry out distributed, parallel processing across many nodes. However, rather than writing intermediate results to disk, it holds them in memory. This allows data to be processed in seconds, rather than the hours or days more typical of Hadoop jobs.
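The difference between keeping intermediate results in memory and writing them to disk can be illustrated with a toy pipeline. This is only an analogy in plain Python, not Presto's actual execution engine. The stage names (`scan`, `filter_stage`, `total`) and the sample rows are made up for illustration. Each stage streams its rows straight to the next one through a generator, so nothing is written out between stages.

```python
from typing import Iterable, Iterator

Row = dict


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """First stage: read rows one at a time from a source."""
    for row in rows:
        yield row


def filter_stage(rows: Iterable[Row], min_amount: int) -> Iterator[Row]:
    """Second stage: keep only rows at or above a threshold."""
    for row in rows:
        if row["amount"] >= min_amount:
            yield row


def total(rows: Iterable[Row]) -> int:
    """Final stage: aggregate the rows that reach it."""
    return sum(row["amount"] for row in rows)


# Hypothetical sample data; each row flows through all stages in memory.
sample_rows = [{"amount": 5}, {"amount": 20}, {"amount": 35}]
result = total(filter_stage(scan(sample_rows), min_amount=10))
print(result)
```

A disk-based design would instead write each stage's full output to storage and read it back before the next stage starts, which is where much of the extra latency comes from.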

Distributed Data Analysis Using SiSense and Presto

The recommended way of using SiSense for big data and near real-time data analysis is a multi-node deployment. This allows for optimal performance and efficiency, especially when using ElastiCube models. Combining SiSense with Presto makes it easier to build highly scalable BI solutions with better performance.

Users benefit from having two distributed systems, each highly optimized for the tasks it needs to handle. Using Presto for data management provides access to more data stores and keeps data management on dedicated, easily separable clusters. Its parallel, distributed architecture is also highly optimized for data management and query execution. Presto simplifies data management further by creating a virtual data warehouse that SiSense can use as its data source, which offers a single source of truth. SiSense then handles BI and analysis.
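As a rough sketch of what the virtual data warehouse looks like in practice: Presto addresses every table as catalog.schema.table, where each catalog corresponds to a configured connector, so a single SQL statement can join tables that live in different stores. The helper functions below only build such a query as a string. The catalog, schema, table, and column names are hypothetical, and the step of submitting the query to a Presto cluster is left out.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a Presto-style fully qualified name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_revenue_query(orders: str, accounts: str) -> str:
    """Build one SQL statement joining tables from two different catalogs."""
    return (
        "SELECT a.region, SUM(o.amount) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {accounts} AS a ON o.account_id = a.id\n"
        "GROUP BY a.region"
    )


# Hypothetical names: an orders table in MySQL and an accounts table in Hive.
orders = qualified("mysql", "sales", "orders")
accounts = qualified("hive", "crm", "accounts")
query = federated_revenue_query(orders, accounts)
print(query)
```

With a setup like this, the BI layer only needs to know about one endpoint and one SQL dialect, while Presto resolves each catalog to its underlying data store.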

Ahana Cloud is the cloud-native, SaaS managed service for Presto. See how you can turbocharge SiSense in 30 minutes!
