4.11.20229 min

Bulldogjob

English 4 IT. Praktyczny kurs angielskiego dla IT

Sprawdź, w jaki sposób podszlifować swój angielski tak, aby znalazł on zastosowanie w branży IT.

Artykuł stanowi fragment książki pt. „English 4 IT. Praktyczny kurs języka angielskiego dla specjalistów IT i nie tylko” Beaty Błaszczyk (Helion 2016).

6.1. What is Business Intelligence?

Business Intelligence (BI) can be defined as a process of collecting business data and data analysis which goal is to provide proper information in order to boost business performance. It’s role is to help executives, business managers and other end-users make sound and informed business decisions as efficiently as possible.

BI encompasses a variety of tools, applications and methodologies which help organizations collect data from internal or external sources, prepare it for analysis, run queries against the data included in database and finally to create reports, interactive slice-and-dice pivot table analyses, dashboards and other data visualizations. Some benefits of using BI tools include:

acceleratedand improved decision-making process based on new and historical data gathered from the source systems;
optimization of business processes and increased operational efficiency;
possibility to spot new market trends, prepare forecasts and even conduct what-if analyses, without the need to guesstimate;
identification of cross-selling and up-selling opportunities as a result of leveraging customer data.

6.2. BI system architecture

A BI system architecture includes the following components presented in figure 6.1 below.

Figure 6.1. Architecture of BI system. Source: Vatovec Krmac E., “Intelligent Value Chain Networks: Business Intelligence and Other ICT Tools and Technologies in Supply/Demand Chains”, Supply Chain Management — New Perspectives, edited by Sanda Renko, InTech, August 29, 2011.

The data sources layer provides data which comes from many different data storage systems. It includes data from relational databases, flat files, spreadsheets and other external sources. This data is therefore inconsistent, fragmented and not standardized.

Data integration layer involves ETL (Extraction, Transformation and Loading) processes which are responsible for data capture, data validation, data cleansing (or scrubbing), data transformation and loading into the data warehouse. In the first step data is extracted from their source to staging area. It is a temporary data store loaded with data from source systems. So, in this case, loading on-the-fly to destination database doesn’t take place. After loading the data to data warehouse, it can be removed from the staging area.

In the next step, data validation with the use of validation rules is required to confirm whether data extracted from the source systems has correct and expected values. Sometimes data profiling is carried out at this stage in order to collect statistics about data such as record counts or range of values as well as information about table structures (definitions about each field or column and their data types).

The cleansing process ensures that inconsistent and invalid data is cleaned before it is loaded to target database. In this process inconsistent or duplicate values are identified and violation of business rules is detected and corrected. As far as the transformation process is concerned, it involves for example changing data formats to prepare it for loading. In the last phase, cleansed and transformed data is loaded to target database.

The data warehouse and data marts are a part of data storage layer. A data warehouse is a subject-oriented, time-variant and non-volatile separate data store designed specifically for reporting and analytics, with data periodically refreshed. A data warehouse must also be carefully designed to meet performance requirements. It is often partially denormalized in order to optimize query and analytical performance. The term enterprise data warehouse means that it is a data store used by the entire company, not just by some departments.

A data mart is a subset of data warehouse. It is a data store created for specific group of users with comparable data needs. It is also created in companies to offload the query workload and often includes aggregated data, ready for quick reporting.

We mustn’t forget about the last layer, which is analysis and presentation layer. It uses the information stored in data storage layer. Data analysis components include software used to present information to business users, so that they are able to conduct analyses. These include the following analytical technologies:

spreadsheets;
online analytical processing (OLAP)and multidimensional online analytical processing (MOLAP) with multidimensional expressions (MDX) or data analysis expressions (DAX) applied;
data miningand text mining tools;
statistical analysistools, etc.

Some of these tools enable the user to slice-and-dice data, perform roll-up, drill-down and drill-across activities or ad-hoc queries, as well as use analytical cubes.

Presentation layer is crucial for presenting information in visually appealing and eye-catching ways in order to help analysts and managers make informed decisions. BI presentation tools include:

data visualization tools,
dashboards and scorecards,
executive reporting tools,
self-service BI (SSBI),
mobile BI tools,
open source BI tools.

6.3. Star schema vs. snowflake schema

Database schemas are logical representations of database structure. A database schema is a collection of database objects, such as tables, views, indexes, synonyms. Two most commonly applied data warehouse schemas, which are data modelling techniques, are star schema and snowflake schema.

In a star schema, denormalized dimension tables are organized around a central fact table. This dimension table can also be a slowly changing dimension (SCD). The fact table is joined to them by foreign key references. Every dimension table includes a primary key on one or more columns and a set of attributes describing the dimension. Dimension tables should have relationships only with fact tables. The key columns in dimension and fact tables are filled with surrogate key values (also called meaningless key values) which are usually numbers with no meaning to business users. They are used to obtain never-changing key values which represent business objects or business events. If fact tables share the same dimension tables, these dimension tables are called conformed dimension tables.

The second type of database schema is snowflake schema which includes normalized dimension tables to avoid redundancies. Another difference between star and snowflake schema is that in the case of snowflake schema dimension tables have relationships with one another. It means that there are several dimension tables for each dimension as opposed to star schema. When querying the database, in the case of snowflake schema, there will be more joins than in star schema, and the computation of the result of the query will be more time-consuming for snowflake schema than in the case of a query run on star schema.

6.4. Gartner Magic Quadrant

Every year vendors of BI tools and analytics platforms are rated by Gartner Inc. upon two criteria — execution of stated vision and their performance on the market. Gartner Magic Quadrant serves as a first step to consider technology providers in terms of specific investment opportunity.

A Magic Quadrant presents a graphical competitive positioning of the following four types of technology providers:

Leaders: These are the companies which execute well against their current vision and are expected to prosper continuously in the future.
Visionaries: These are the companies which understand where the market is going or have a vision of their future change, but do not yet execute their vision well.
Niche players: These are the players which focus successfully on a small segment and do not out-innovate or outperform other players on the market.
Challengers:These are the companies which execute their vision well today and may dominate a large segment in the future, but they do not understand yet the market direction.

Figure 6.2. Magic Quadrant for Business Intelligence and Analytics Platforms for the year 2016

Sources: Parenteau J., Sallam R. L., Howson C., Tapadinhas J., Schlegel K., Oestreich T. W., Magic Quadrant for Business Intelligence and Analytics Platforms, 4 February 2016. Retrieved from http://www.gartner.com/doc/reprints?id=1-2XXET8P&ct=160204

6.5. Vocabulary

(to) accelerate — przyspieszać

ad-hoc query — zapytanie ad hoc

aggregated data — dane zagregowane

analysis and presentation layer — warstwa analizy i prezentacji

analytical cube — kostka analityczna

(to) boost business performance — pobudzać wzrost wyników działalności gospodarczej

business intelligence (BI) — analityka biznesowa/business intelligence

challenger — pretendent

cleansing process — proces oczyszczania

(to) collect data — gromadzić dane

computation — obliczenie

(to) conduct analysis — przeprowadzić analizę

conformed dimension table — współdzielona tabela wymiarów

cross-selling — sprzedaż krzyżowa

dashboard — kokpit menedżerski

data analysis expressions (DAX) — wyrażenia analizy danych

data capture — zbieranie danych

data cleansing — oczyszczanie danych

data format — format danych

data integration layer — warstwa integracji danych

data loading — ładowanie danych

data mart — składnica danych/tematyczna hurtownia danych

data mining — eksploracja danych

data profiling — profilowanie danych

data scrubbing — oczyszczanie danych

data sources layer — warstwa źródeł danych

data storage layer — warstwa przechowywania danych

data storage system — system przechowywania danych

data store — magazyn danych

data transformation — przekształcanie danych/transformacja danych

data type — typ danych

data validation — sprawdzanie poprawności danych

data visualization tool — narzędzie do wizualizacji danych

data visualization — wizualizacja danych

data warehouse — hurtownia danych

database schema — schemat bazy danych

(to) denormalize — dokonać denormalizacji

(to) detect — wykrywać

dimension table — tabela wymiarów

drill-across — drążenie w poprzek

drill-down — drążenie w głąb/drążenie w dół

duplicate value — duplikat wartości

(to) encompass — obejmować

end-user — użytkownik końcowy

(to) ensure — zapewniać

enterprise data warehouse — korporacyjna hurtownia danych

ETL process — proces ETL/proces ekstrakcji, transformacji i ładowania

executive reporting tool — narzędzie do raportowania dla kadry zarządzającej

executive — członek zarządu

expected value — wartość oczekiwana

eye-catching — przykuwający wzrok

fact table — tabela faktów

flat file — plik płaski

forecast — prognoza

foreign key reference — odwołanie do klucza obcego

fragmented — rozdrobniony/podzielony na części

(to) guesstimate — szacować „na oko”

inconsistent — niespójny

informed business decision — świadoma decyzja biznesowa

invalid data — nieprawidłowe dane

investment opportunity — możliwość inwestycyjna

join — złączenie

leader — lider

(to) leverage customer data — wykorzystać dane o klientach

loading on-the-fly — ładowanie w locie

meaningless key — klucz sztuczny

mobile BI tools — narzędzia do analizy business intelligence na urządzeniach mobilnych

multidimensional expression (MDX) — wyrażenie wielowymiarowe

multidimensional online analytical processing (MOLAP) — wielowymiarowe przetwarzanie analityczne online

niche player — gracz niszowy

non-volatile — nieulotny

(to) offload — odciążać

online analytical processing (OLAP) — przetwarzanie analityczne online

open source BI — narzędzia typu open source do analiz business intelligence

operational efficiency — efektywność operacyjna

(to) out-innovate — wykraczać poza ramy w zakresie wprowadzania innowacji

(to) out-perform — wykraczać poza ramy w zakresie wyników działalności gospodarczej

periodical — okresowy

positioning — pozycjonowanie

primary key — klucz główny

query workload — obciążenie wskutek wykonywania zapytania

record count — liczba rekordów

redundancy — redundancja/nadmiarowość

(to) refresh — odświeżać

relational database — relacyjna baza danych

relationship — związek/relacja

roll-up — konsolidowanie/zwijanie

(to) run a query — uruchomić zapytanie

scorecard — karta wyników

self-service BI (SSBI) — samodzielnie przeprowadzana analiza business intelligence

(to) slice-and-dice data — analizować dane w dowolnych przekrojach i rzutach

slice-and-dice pivot table analysis — analiza danych w dowolnych przekrojach i rzutach za pomocą tabeli przestawnej

slowly changing dimension (SCD) — wymiar wolnozmienny

snowflake schema — schemat płatka śniegu

sound business decision — rozsądna/trafna decyzja biznesowa

(to) spot — zauważać/dostrzegać

spreadsheet — arkusz kalkulacyjny

staging area — obszar tymczasowy

star schema — schemat gwiazdy

(to) state — określać/ogłaszać

statistical analysis — analiza statystyczna

subject-oriented — tematyczny

surrogate key — klucz sztuczny

target database — docelowa baza danych

temporary data store — tymczasowe miejsce przechowywania danych

text mining — eksploracja tekstu/danych tekstowychtime-consuming — czasochłonny

time-variant — zmienny w czasie

(to) transform — przekształcać

up-selling — sprzedaż produktów droższych

validation rule — reguła poprawności

violation of business rules — naruszenie reguł biznesowych

visionary — wizjoner

visually appealing — atrakcyjny wizualnie

what-if analysis — analiza typu „co-jeśli?”

„English 4 IT. Praktyczny kurs języka angielskiego dla specjalistów IT i nie tylko” Beaty Błaszczyk można kupić w księgarni Helion.pl.