
Extract Translate Load

So, what is Extract Translate Load like?
  - ETL came about as a way to get analytics at scale. Once data reaches hundreds of terabytes, or even petabytes, asking questions of it may call for a high-performance computing (HPC) system. Scaling out horizontally on commodity compute is more cost-effective in most business cases. Hadoop lent a helping hand at first; when Spark proved it could do the job more efficiently, the world said "Why not?".
To get analytics on huge data that is largely unstructured and comes from heterogeneous sources, we did what we do with every other engineering problem: divide it so we can conquer it with ease. The Extract layer abstracts away the different data sources and simply gets us the data. The Translate layer structures that data so that our logical questions fit it. Load comes in where we need to distribute the compute task at hand across large commodity clusters. This is where a big data framework is a friend in need.
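The three layers described above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the two inline sample sources, the field names, and the helper functions are all hypothetical, and a local thread pool stands in for the commodity cluster that a framework such as Spark would manage.

```python
import csv
import io
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample sources standing in for heterogeneous systems.
CSV_SOURCE = "user,amount\nalice,10\nbob,5\n"
JSON_SOURCE = '[{"name": "alice", "total": "7"}, {"name": "carol", "total": "3"}]'


def extract():
    """Extract: hide the source formats and yield raw records as dicts."""
    yield from csv.DictReader(io.StringIO(CSV_SOURCE))
    yield from json.loads(JSON_SOURCE)


def translate(record):
    """Translate: map each raw record onto one common schema."""
    user = record.get("user") or record.get("name")
    amount = record.get("amount") or record.get("total")
    return {"user": user, "amount": int(amount)}


def partition(records, n):
    """Split records into n roughly equal chunks, one per worker."""
    return [records[i::n] for i in range(n)]


def partial_totals(chunk):
    """The per-worker task: total amount per user within one chunk."""
    totals = {}
    for row in chunk:
        totals[row["user"]] = totals.get(row["user"], 0) + row["amount"]
    return totals


def load_and_run(records, workers=2):
    """Load: hand chunks to workers, then merge their partial results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for totals in pool.map(partial_totals, partition(records, workers)):
            for user, amount in totals.items():
                merged[user] = merged.get(user, 0) + amount
    return merged


structured = [translate(r) for r in extract()]
result = load_and_run(structured)
print(result)
```

The question being asked ("how much did each user spend?") only becomes easy once Translate has put every record into the same shape; the Load step then splits the work and merges the partial answers.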


Popular posts from this blog

Event Sourcing with CQRS.

Event sourcing works with CQRS by having part of the application model updates as writes to an event log or Kafka topic. This is paired with an event handler that subscribes to the Kafka topic, transforms each event (as required) and writes the materialized view to a read store.
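The flow described above can be sketched in a few lines of Python. This is a toy under stated assumptions: an in-memory list stands in for the Kafka topic, a dict stands in for the read store, and the event shapes (`deposited`, `withdrawn`) and function names are hypothetical.

```python
# Append-only event log; an in-memory stand-in for a Kafka topic (assumption).
event_log = []
subscribers = []

# Read store holding the materialized view: account id -> balance.
read_store = {}


def publish(event):
    """Write side: record the update as an event and notify subscribers."""
    event_log.append(event)
    for handler in subscribers:
        handler(event)


def balance_handler(event):
    """Event handler: transform each event into a materialized-view update."""
    current = read_store.get(event["account"], 0)
    if event["type"] == "deposited":
        read_store[event["account"]] = current + event["amount"]
    elif event["type"] == "withdrawn":
        read_store[event["account"]] = current - event["amount"]


def get_balance(account):
    """Read side: queries touch only the read store, never the log."""
    return read_store.get(account, 0)


def rebuild_view():
    """Replay the whole log to rebuild the view from scratch."""
    read_store.clear()
    for event in event_log:
        balance_handler(event)


subscribers.append(balance_handler)
publish({"type": "deposited", "account": "acc-1", "amount": 100})
publish({"type": "withdrawn", "account": "acc-1", "amount": 30})
print(get_balance("acc-1"))
```

Because the log is the source of truth, the read store can be thrown away and rebuilt at any time by replaying the events, which is what `rebuild_view` does.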

GraphQL microservices (GQLMS)

I'm curious about GraphQL!
  - GraphQL is an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data. GraphQL was developed internally by Facebook in 2012 before being publicly released in 2015.
It should be solving a problem in querying data!
  - GraphQL lets you ask for what you want in a single query, saving bandwidth and reducing waterfall requests. It also enables clients to request their own unique data specifications.
A case study?
  - https://netflixtechblog.com/beyond-rest-1b76f7c20ef6
So, is this just another database technology?
  - No. GraphQL is often confused with being a database technology. This is a misconception: GraphQL is a query language for APIs, not databases. In that sense it's database agnostic and can be used with any kind of database, or even no database at all. Source: howtographql.com
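The idea of "asking for what you want" can be imitated with a toy field selector in Python. To be clear, this is not GraphQL and uses no GraphQL library: the `select` function, the nested-dict selection format, and the sample data are all hypothetical, and only illustrate how a client-supplied shape can trim a response.

```python
def select(data, selection):
    """Return only the fields named in selection, recursing into nested ones."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        # A nested selection means "descend"; None means "take the value as is".
        result[field] = select(value, sub) if sub else value
    return result


# Hypothetical data that could come from any backend, or no database at all.
user = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "posts": [{"id": 10, "title": "Hello", "body": "A long post body"}],
}

# The client states the shape it wants: the name plus each post's title.
selection = {"name": None, "posts": {"title": None}}
shaped = select(user, selection)
print(shaped)
```

The client receives the name and post titles in one round trip and nothing else, which is the bandwidth and waterfall-request saving mentioned above.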