Conf42 JavaScript 2022 - Online

Data analytics in browser with AlaSQL.JS

Video size:

Abstract

Today’s PCs are more powerful than the chips used in the Apollo mission. Participants will learn to implement a robust frontend data analytics framework using AlaSQL.JS. Data can be imported to AlaSQL very fast, then custom queries can be written to do different complex analytical functions without hitting the server. Participants will also learn to build a custom reporting framework that sits in the browser and is customized for different users’ needs, which is highly desirable for multi-tenant architecture. In the end, participants will get to know about a few more interesting use cases and open source projects to contribute to.

Summary

  • Today we are going to discuss technology and approach to solving data analysis problem on any devices that run JavaScript. AlaSQL is a lightweight, easy to use client side in memory SQL database designed to work in browsers as well as in node JS. Activating data at age makes it possible to ask patent questions and get more timely answers.
  • AlaSQL also provides a way to use the browser's local storage and DOM storage as a data storage. This is silver bullet for all the edge operations. There are certain limitations. The number of files you increase can make the application suffer.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Jamaica real time feedback into the behavior of your distributed systems and observing changes exceptions errors in real time allows you to not only experiment with confidence but respond instantly to get things working again. Close hi, I'm Michael Korra, CEO and co founder of Globe. Today me and my co speaker Bhavijit are going to talk about how we can leverage the processing power of client devices to do data processing and data analysis. One such framework is AlaSQL that we are going to go deep dive into. So by the time this talk will be over, 12 billion megabyte of data will be generated across the world and a massive amount of this data will be generated outside the traditional data center. So cloud will extend to the cases and it will not be cloud versus edge, it will be cloud with edge. So that's the future that we're going towards. Activating data at age makes it possible to ask patent questions and get more timely answers. Age can be your mobile, your laptop, smartwatch, gaming consoles, home appliances can be anything. Today we are going to discuss technology and approach to solving data analysis problem on any devices that run JavaScript. And this talk is not going to explain the differences or comparison between different SQL libraries available for the browsers. It only focuses on ALS. Now some background when someone shares an analysis implementation of JavaScript, you're not just seeing a static snapshot of the work, you are basically running it real time on browser live. So you can go beyond passive reading by querying data, tweaking assumptions and asking more questions, searching and everything. La SQL is a lightweight, easy to use client side in memory SQL database designed to work in browsers as well as in node JS. It is open source. It has strong focus on query speed, data source flexibility for the relational database. Like you can import cases, Excel, SQLite databases and everything over to here. Moreover, it can handle schemaless data as well as graph data. It can handle both traditional relationship tables as well as nested JSOns which is nosql. You can export store import data from local storage index, DB, Excel and many more. So AlaSQL was initially created in 2014 with the need for running NoSQL capabilities on browsers because at that point in time there are very few engines which are capable of running NoSQL features and then there are a lot of other significant releases. So here are the details for you. JavaScript is one of the most flexible language or frameworks available for communication. It is flexible as well as it supports experimentations. Just think of you can just open the developer tools and do experimentations directly on a wave application. It also supports collaboration real time. So think of you are using analysis with your colleagues remote over the Internet on a fixed set of data which is running in individual browsers. Now there are multiple databases available for client side processing. AlaSQL sits into the requirement of use cases where we don't require a persistent storage, we just need fast in memory data processing for joining, filtration, grouping of data, searching and everything. Moreover, it is also suitable for a use case where there is a large volume of data on the server. You query to do the initial filtration, take the data back to the browser and then run processing, pre processing and other sort of queries on top of a few specific features of AlASQL are fundamentally important for this topic where we are discussing about processing data on age on browsers. So it is widely supported, it is extensible, it has the ability to execute SQL against any data set, against JSON arrays and all. So it is fast, it is in memory. SQL data processing is very much important for bi use cases, for ERP applications, fat clients and many more. ETL is also one of the very important use case. It also supported in most of the major browsers, mobile applications, node cases and all. It supports inbuilt compilation, query optimization like any other SQL engines. It supports indexing, pre indexing as well. Complex join operations can be run very efficient. Now I'm going to discuss about few fundamentals of AlaSQL, so consider a use case where you need to augment the capabilities of SQL statements with actual JavaScript function, custom or inbuilt. So you can also do that. I'll show you some examples after the stop after this slide. It has flexibility to import data from various sources, be it Excel cases, JSON and many more. If you're working in a use case where you need to extend the in memory capabilities for persistent storage so that you can also do them. Export the data to any of your required format, then few of the performance threads so the compiled statements and functions are created when you actually write the SQL statements where expressions are pre filtered for the joins, joint tables are pre indexed. ArieSQL uses hash tables for its indexes and upon index creation all entries in the table are hashed and stored in the database object. Sorry, in the JavaScript object. Now let me talk about the engine. So when a statement in SQL is parsed and processed, so it is segregated into an abstract syntax tree which is kind of a nested JSON. So I have given you one example. So where this is the SQL statement and after the parsing and formation of the abstract syntax. This is something that is actually processed in the engine. Now I will hand it over to my co speaker Akajit for a technical deep dive. Before going into the details, I would like to share a small application which I have created which can shed some light into the potential of this library. What I have created is an application which can pick a local file and then render the records and then we can apply any operations you like. For the example purpose, I have created two simple examples. Search duplicate, count and average. Search duplicate is going to give you all the duplicate records and count and average as it states can provide you all the count, average and sum. These can be done by basic JS logics, but what about the complex join or joining two files at a time? Those will not be as simple as such to me. LSp provides variety ways of reading files importing from csv text files Excel files this feature can be useful in cases where we want to create a software where we just pass an excel or csv file and do some operations over it. Now I can now what if a requirement comes which can't be solved by quake? AlaSQL provides its user defined functions which we can use to create our own functions and then use it in the quake. AlaSQL also provides a way to use the browser's local storage and DOM storage as a data storage. Let's say for a scenario where you have to send a large data set to the server for processing. What you can do is you can do the processing in the client side and then send the data to the. AlaSQL also provides a way to use the browser's local storage and DOM storage as a data storage. Let's say for a scenario you want to send a huge amount of data set to the server for processing. What you can do is you can process the data in the client side and then send the process data to the server. This can be customized in the other way around. Also, let's say you have a huge amount of data set present in the server but it needs preprocessing every time client interacts with it. What you can do is you can send the whole data to the client side and then do the processing and render the data in the client side rather than requesting it every time any kind of changes is made by the client. This is silver bullet for all the edge operations. There are certain limitations. Let's say you are bringing in a huge amount of data set from the server side to the client side the amount of time it will take to take the data from the server to the client side. That time will make the application suffer. Then there are join operations. Let's say you are uploading multiple files to apply join operation across three to four files. Now keep in mind the number of files you increase can make the application suffer because it's going to apply the operations over multiple records which the data is been added in library. Also, as they have mentioned in their talks, joining a subselect does not work. Thank you.
...

Gaurab Patra

Co-Founder @ Techweirdo Consultancy

Gaurab Patra's LinkedIn account Gaurab Patra's twitter account

Bhagyajit Jagdev

SDE III @ Techweirdo Consultancy

Bhagyajit Jagdev's LinkedIn account



Awesome tech events for

Priority access to all content

Video hallway track

Community chat

Exclusive promotions and giveaways