Cubes is a lightweight, open-source multidimensional modelling and OLAP toolkit for developing reporting applications and browsing aggregated data. It is written in the Python programming language and released under the MIT License.
Original author(s) | Stefan Urbanek[1] |
---|---|
Initial release | March 27, 2011 |
Stable release | 1.1 / July 2, 2016 |
Repository | github |
Written in | Python |
Operating system | Cross-platform |
Type | OLAP |
License | MIT License[2] |
Website | cubes |
Cubes aims to provide analysts and application end-users with an "understandable and natural way of reporting using concept of data Cubes – multidimensional data objects".
Cubes was first publicly released in March 2011. The project was originally developed for Public Procurements of Slovakia.[3] Cubes 1.0 was released in September 2014 and presented at the PyData conference in New York.[4]
Features
- OLAP and aggregated browsing (default is ROLAP)
- logical model of OLAP cubes in JSON or provided from external sources
- hierarchical dimensions (attributes that have hierarchical dependencies, such as category-subcategory or country-region)
- multiple hierarchies in a dimension
- arithmetic expressions for computing derived measures and aggregates
- localizable metadata and data
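The hierarchical dimensions listed above can be illustrated without Cubes itself. The following is a minimal sketch in plain Python, not the Cubes API: the `FACTS` rows, the `HIERARCHY` tuple and the `roll_up` helper are hypothetical names invented for this illustration. It sums a measure at a chosen depth of a country-region hierarchy.

```python
from collections import defaultdict

# Hypothetical fact rows; each carries the attributes of a two-level
# "location" hierarchy (country -> region) and one measure.
FACTS = [
    {"country": "SK", "region": "Bratislava", "amount": 100.0},
    {"country": "SK", "region": "Kosice", "amount": 50.0},
    {"country": "CZ", "region": "Prague", "amount": 70.0},
    {"country": "CZ", "region": "Brno", "amount": 30.0},
]

# Levels ordered from the most general to the most detailed.
HIERARCHY = ("country", "region")


def roll_up(facts, hierarchy, depth, measure="amount"):
    """Sum `measure` for every path of length `depth` in `hierarchy`."""
    totals = defaultdict(float)
    for row in facts:
        # A path is the tuple of level values down to the requested depth.
        path = tuple(row[level] for level in hierarchy[:depth])
        totals[path] += row[measure]
    return dict(totals)


by_country = roll_up(FACTS, HIERARCHY, depth=1)
by_region = roll_up(FACTS, HIERARCHY, depth=2)
print(by_country)
print(by_region)
```

Increasing `depth` corresponds to drilling down one level; decreasing it rolls the same facts up to a coarser level.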
Cubes is capable of handling large amounts of data and complex queries. According to a review by TechTarget, Cubes can handle "data volumes in the hundreds of millions of rows" and "complex queries and calculations that require multi-level aggregations and dynamic subsetting." The review also notes that Cubes is well suited to smaller organizations or teams that do not require the complexity and scalability of enterprise-level OLAP solutions.[5]
Model
The logical (conceptual) model in Cubes is described using JSON and can be provided as a file, as a directory bundle, or by an external model provider (for example, a database). The basic model objects are cubes with their measures and aggregates, and dimensions with their attributes and hierarchies. The logical model also contains a mapping from logical attributes to their physical location in a database (or other data source).
Example model:
{
  "cubes": [
    {
      "name": "sales",
      "label": "Our Sales",
      "dimensions": [ "date", "customer", "location", "product" ],
      "measures": [ "amount" ]
    }
  ],
  "dimensions": [
    {
      "name": "product",
      "label": "Product",
      "levels": [
        {
          "name": "category",
          "label": "Category",
          "attributes": [ "category_id", "category_label" ]
        },
        {
          "name": "product",
          "label": "Product",
          "attributes": [ "product_id", "product_label" ]
        }
      ]
    },
    ...
  ]
}
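To show how the pieces of such a model relate, here is a small sketch that parses a model like the one above with Python's standard `json` module and lists its cubes and dimension levels. It does not use Cubes; the `describe_model` helper is a hypothetical name, and the elided entries ("...") are simply left out so that the text is valid JSON.

```python
import json

# A complete, valid variant of the example model; the elided
# dimensions are omitted rather than guessed.
MODEL_JSON = """
{
    "cubes": [
        {
            "name": "sales",
            "label": "Our Sales",
            "dimensions": ["date", "customer", "location", "product"],
            "measures": ["amount"]
        }
    ],
    "dimensions": [
        {
            "name": "product",
            "label": "Product",
            "levels": [
                {"name": "category", "label": "Category",
                 "attributes": ["category_id", "category_label"]},
                {"name": "product", "label": "Product",
                 "attributes": ["product_id", "product_label"]}
            ]
        }
    ]
}
"""


def describe_model(model):
    """Return one summary line per cube and per dimension level."""
    lines = []
    for cube in model.get("cubes", []):
        lines.append("cube {}: measures={}, dimensions={}".format(
            cube["name"], cube["measures"], cube["dimensions"]))
    for dim in model.get("dimensions", []):
        # Levels are listed from the most general to the most detailed.
        for depth, level in enumerate(dim.get("levels", []), start=1):
            lines.append("dimension {} level {} ({}): {}".format(
                dim["name"], depth, level["name"], level["attributes"]))
    return lines


model = json.loads(MODEL_JSON)
summary_lines = describe_model(model)
for line in summary_lines:
    print(line)
```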
Operations
Cubes provides a basic set of operations such as data drilling and filtering (slicing and dicing). The operations can be accessed either through a Python interface or through a lightweight web server called Slicer.
Example of the Python interface:
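from cubes import Workspace
# Create a workspace from the slicer.ini configuration file
workspace = Workspace("slicer.ini")
# Get an aggregation browser for the "sales" cube
browser = workspace.browser("sales")
# Aggregate the whole cube (no cuts) and print the summary
result = browser.aggregate()
print(result.summary)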
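To make the terms slicing, dicing and drill-down concrete, the sketch below imitates them on a small in-memory list of rows. It is plain Python and deliberately does not call the Cubes API: the `aggregate` function and its `cut` and `drilldown` parameters are hypothetical stand-ins, and the rows are invented sample data.

```python
from collections import defaultdict

# Invented sample facts for a "sales" cube.
ROWS = [
    {"year": 2012, "product": "apples", "amount": 10.0},
    {"year": 2012, "product": "pears", "amount": 5.0},
    {"year": 2013, "product": "apples", "amount": 7.0},
]


def aggregate(rows, cut=None, drilldown=None, measure="amount"):
    """Sum `measure` over rows matching `cut`, optionally grouped.

    cut       -- dict of attribute -> required value; one key is a
                 "slice", several keys together form a "dice"
    drilldown -- attribute name to group the matching rows by
    """
    cut = cut or {}
    selected = [r for r in rows
                if all(r[key] == value for key, value in cut.items())]
    summary = sum(r[measure] for r in selected)
    cells = defaultdict(float)
    if drilldown is not None:
        for r in selected:
            cells[r[drilldown]] += r[measure]
    return {"summary": summary, "cells": dict(cells)}


total = aggregate(ROWS)
sliced = aggregate(ROWS, cut={"year": 2012}, drilldown="product")
print(total["summary"])
print(sliced)
```

The returned dictionary loosely mirrors the summary-plus-cells shape of an aggregation result, but it is only an analogy to what an aggregation browser returns.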
Server
Cubes provides a non-traditional OLAP server with an HTTP query interface and a JSON response API. Example query to get the "total amount of all contracts between January 2012 and June 2012 by month":
http://localhost:5000/cube/contracts/aggregate?drilldown=date&drilldown=criteria&cut=date:2012,1-2012,6&order=date.month:desc
The response looks like:
{
  "summary": {
    "contract_amount_sum": 10000000.0
  },
  "remainder": {},
  "cells": [
    {
      "date.year": 2012,
      "criteria.code": "ekonaj",
      "contract_amount_sum": 12345.0,
      "criteria.description": "economically best offer",
      "criteria.sdesc": "best offer",
      "criteria.id": 3
    },
    {
      "date.year": 2012,
      "criteria.code": "cena",
      "contract_amount_sum": 23456.0,
      "criteria.description": "lowest price",
      "criteria.sdesc": "lowest price",
      "criteria.id": 4
    },
    ...
  ],
  "total_cell_count": 6,
  "aggregates": [
    "contract_amount_sum"
  ],
  "cell": [
    {
      "type": "range",
      "dimension": "date",
      "hierarchy": "default",
      "level_depth": 2,
      "invert": false,
      "hidden": false,
      "from": [ "2012", "1" ],
      "to": [ "2012", "6" ]
    }
  ],
  "levels": {
    "criteria": [ "criteria" ],
    "date": [ "year" ]
  }
}
The simple HTTP/JSON interface makes it very easy to integrate OLAP reports in web applications written in pure HTML and JavaScript.
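As an illustration of such an integration from the client side, the sketch below rebuilds the example query URL with Python's standard library and reads values out of a response-shaped dictionary. The helper names `aggregate_url` and `amounts_by` are hypothetical, and the sample response is a trimmed copy of the one shown above rather than output fetched from a running server.

```python
from urllib.parse import urlencode

# Base URL of a locally running Slicer server, as in the example query.
BASE_URL = "http://localhost:5000"


def aggregate_url(cube, drilldown=(), cut=None, order=None):
    """Build an aggregate query URL for `cube`."""
    # Repeated "drilldown" keys are expressed as separate pairs.
    params = [("drilldown", d) for d in drilldown]
    if cut:
        params.append(("cut", cut))
    if order:
        params.append(("order", order))
    # Keep ":" and "," readable instead of percent-encoding them.
    query = urlencode(params, safe=":,")
    return "{}/cube/{}/aggregate?{}".format(BASE_URL, cube, query)


def amounts_by(response, key, aggregate="contract_amount_sum"):
    """Map each cell's `key` attribute to its aggregate value."""
    return {cell[key]: cell[aggregate] for cell in response["cells"]}


url = aggregate_url(
    "contracts",
    drilldown=["date", "criteria"],
    cut="date:2012,1-2012,6",
    order="date.month:desc",
)
print(url)

# A trimmed response dict shaped like the JSON shown above; a real
# client would fetch `url` and decode the body with json.loads.
sample_response = {
    "summary": {"contract_amount_sum": 10000000.0},
    "cells": [
        {"date.year": 2012, "criteria.code": "ekonaj",
         "contract_amount_sum": 12345.0},
        {"date.year": 2012, "criteria.code": "cena",
         "contract_amount_sum": 23456.0},
    ],
}
by_criteria = amounts_by(sample_response, "criteria.code")
print(by_criteria)
```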
The Slicer server contains endpoints describing the cube metadata, which help to create generic reporting applications[6] that do not have to know the database model structure and conceptual hierarchies up front.
The Slicer server is written using the Flask web framework.
ROLAP and SQL
The built-in SQL backend of the framework provides ROLAP functionality on top of a relational database. Cubes contains a SQL query generator that translates the reporting queries into SQL statements. The query generator takes into account the topology of the star or snowflake schema and executes only the joins that are necessary to retrieve the attributes required by the data analyst.
The SQL backend uses the SQLAlchemy Python toolkit to construct the queries.
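The idea of executing only the necessary joins can be sketched in a few lines. The following is not the Cubes query generator and does not use SQLAlchemy; it is a simplified illustration in which the table names, the `MAPPINGS` and `JOINS` dictionaries, and the `required_joins` and `build_query` helpers are all hypothetical. Each logical attribute maps to a physical column, each non-fact table names the table it joins to, and the sketch follows those links back to the fact table.

```python
# Hypothetical snowflake schema around a "sales" fact table.
FACT_TABLE = "sales"

# Logical attribute -> (physical table, column).
MAPPINGS = {
    "amount": ("sales", "amount"),
    "product.name": ("dim_product", "name"),
    "product.category": ("dim_category", "label"),
    "customer.name": ("dim_customer", "name"),
}

# Joined table -> (parent table, join condition). dim_category hangs off
# dim_product, which makes the schema a snowflake rather than a star.
JOINS = {
    "dim_product": ("sales", "sales.product_id = dim_product.id"),
    "dim_category": ("dim_product",
                     "dim_product.category_id = dim_category.id"),
    "dim_customer": ("sales", "sales.customer_id = dim_customer.id"),
}


def required_joins(attributes):
    """Return (table, condition) pairs needed to reach `attributes`."""
    ordered = []
    for attribute in attributes:
        table = MAPPINGS[attribute][0]
        chain = []
        # Walk from the attribute's table up to the fact table.
        while table != FACT_TABLE:
            parent, condition = JOINS[table]
            chain.append((table, condition))
            table = parent
        # Emit parents first so every join refers to a known table.
        for join in reversed(chain):
            if join not in ordered:
                ordered.append(join)
    return ordered


def build_query(group_by, measure="amount"):
    """Build a SUM/GROUP BY statement that joins only what it needs."""
    columns = [".".join(MAPPINGS[a]) for a in group_by]
    m_table, m_column = MAPPINGS[measure]
    select = columns + [
        "SUM({}.{}) AS {}_sum".format(m_table, m_column, measure)]
    sql = "SELECT {} FROM {}".format(", ".join(select), FACT_TABLE)
    for table, condition in required_joins(group_by):
        sql += " JOIN {} ON {}".format(table, condition)
    if columns:
        sql += " GROUP BY {}".format(", ".join(columns))
    return sql


sql = build_query(["product.category"])
print(sql)
```

Grouping by `product.category` pulls in `dim_product` and `dim_category` but leaves `dim_customer` out of the statement, which is the behaviour the paragraph above describes.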
References
1. Stefan Urbanek is the creator of Cubes and Data Brewery.
2. "DataBrewery / cubes / blob / master / LICENSE". GitHub. Retrieved 21 February 2015.
3. Public Procurements of Slovakia by Transparency International Slovakia.
4. Cubes 1.0 Overview at PyData NYC 2014 (video).
5. "Open source Cubes OLAP server suits business users under the gun". SearchBusinessAnalytics. Retrieved 28 March 2023.
6. Cubes Viewer.