mardi 29 décembre 2009

mercredi 23 décembre 2009

CA Erwin Data Modeler Community Ed 7.3



New installation: CA ERwin Data Modeler Community Edition 7.3
Sometimes I feel outdated using old software... the ERwin I was using until now is version 4. Now all my models will have to be redone. What a pain!
It's funny how these evolutions happen. The extension used to be *.er1; now it's *.erwin.
Well, c'est la vie!

jeudi 3 décembre 2009

Job Seek Model


Ignore the data types; they are wrong.

mercredi 2 décembre 2009

Google's secrets

BigTable and GFS (Google File System)

lundi 9 novembre 2009

CD Collection Model - Erwin




First version:
RED Tables are not valid for the model.


jeudi 5 novembre 2009

Working with ERwin

The doubt about identifying vs. non-identifying relationship types:
Identifying means the parent's PK migrates to the child table and becomes part of the child's PK as well.

Non-identifying means the parent's PK migrates to the child table only as a foreign key.

Create two tables:


To break up an N:N relationship, use the IDENTIFYING type.
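The two relationship types, and the associative table that breaks up an N:N, can be sketched in DDL. A minimal sketch using Python's sqlite3; all table and column names here (orders, customer, student, course, enrollment) are my own hypothetical examples, not from the models in the post:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Identifying: the parent PK (order_id) migrates into the child
-- and becomes part of the child's composite PK.
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY
);
CREATE TABLE order_item (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    line_no  INTEGER NOT NULL,
    PRIMARY KEY (order_id, line_no)
);

-- Non-identifying: customer_id migrates only as a plain FK,
-- not part of the child's PK.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY
);
CREATE TABLE orders2 (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id)
);

-- Breaking up an N:N (student/course) with an associative table
-- whose two IDENTIFYING relationships form its composite PK.
CREATE TABLE student (student_id INTEGER PRIMARY KEY);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

# The composite PK of the associative table rejects duplicate pairs.
cur.execute("INSERT INTO student VALUES (1)")
cur.execute("INSERT INTO course VALUES (10)")
cur.execute("INSERT INTO enrollment VALUES (1, 10)")
try:
    cur.execute("INSERT INTO enrollment VALUES (1, 10)")
except sqlite3.IntegrityError:
    print("duplicate enrollment rejected")
```

The associative table carries one identifying relationship from each parent, which is exactly why the modeling tools want IDENTIFYING here: both migrated keys must become the new table's PK.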


I still don't know how to use this symbol, and knowing it doesn't help much.


Working with Power Designer

When I spend some time away from PowerDesigner, I forget the basic things to do. So, to help me remember, I'm writing this post.

The first thing to do is create the tables as usual:


Then create the dependency between them:



Now create a UNIQUE index for the PK: double-click on the table.

The last step is to double-click the index you have just created and point it to the PK in the column definition drop-down list:


Voilà. After creating the index and pointing it to the PK of the city table, check
for errors or warnings: press F4 for "Check Model". Good job.
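The DDL those steps end up generating looks roughly like the following. A sketch via Python's sqlite3; the city columns are assumptions on my part, and the table here deliberately relies on the unique index (rather than a PRIMARY KEY clause) to mirror the index-on-PK step from the post:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Step 1: create the table as usual (hypothetical "city" example).
cur.execute("""
CREATE TABLE city (
    city_id INTEGER NOT NULL,
    name    TEXT
)
""")

# Steps 2-4: a UNIQUE index pointed at the PK column, mirroring
# what the index dialog produces.
cur.execute("CREATE UNIQUE INDEX idx_city_pk ON city (city_id)")

# The unique index is what enforces PK uniqueness here.
cur.execute("INSERT INTO city VALUES (1, 'Paris')")
try:
    cur.execute("INSERT INTO city VALUES (1, 'Lyon')")
except sqlite3.IntegrityError:
    print("unique index rejected duplicate PK value")
```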

dimanche 1 novembre 2009

Data Modeling Studies: Band and Members

CA ERwin

Sybase PowerDesigner

MS Access 2000

vendredi 30 octobre 2009

mercredi 28 octobre 2009

Data Diagrams

This is very helpful:

http://www.databaseanswers.org/data_models/index.htm

http://www.databaseanswers.com/modelling_tools.htm


Along with those, these are the main modeling tools:

CA Erwin
Sybase Power Designer

ETL Tools

At present the most popular and widely used ETL tools and applications on the market are:


1. IBM WebSphere DataStage (formerly known as Ascential DataStage and Ardent DataStage)
2. Informatica PowerCenter
3. Oracle Warehouse Builder
4. Ab Initio
5. Pentaho Data Integration - Kettle Project (open source ETL)
6. SAS ETL Studio
7. IBM Cognos DecisionStream
8. Business Objects Data Integrator (BODI)
9. Microsoft SQL Server Integration Services (SSIS)
10. ETI Extract


11. Elixir Repertoire 7.2.2 (Elixir)
12. Data Migrator 7.6 (Information Builders)
13. Talend Open Studio 3.1 (Talend)
14. DataFlow Manager 6.5 (Pitney Bowes Business Insight)
15. Data Integrator 8.12 (Pervasive)
16. Open Text Integration Center 7.1 (Open Text)
17. Transformation Manager 5.2.2 (ETL Solutions Ltd.)
18. Clover ETL 2.5.2 (Javlin)
19. ETL4ALL 4.2 (IKAN)
20. DB2 Warehouse Edition 9.1 (IBM)
21. Pentaho Data Integration 3.0 (Pentaho)
22. Adeptia Integration Server 4.9 (Adeptia)


tutorial

IBM RedBrick

IBM Red Brick™ Warehouse 6.3 delivers powerful, integrated analytics for business intelligence on demand.

The latest release of IBM Red Brick Warehouse -- version 6.3 -- can manage huge amounts of complex data, integrate with existing technologies to preserve your IT investments, and scale to meet the expanding needs of your business. It is available in two configurations: Red Brick Workgroup Edition and Red Brick Enterprise Edition.

Red Brick Workgroup Edition 6.3 is ideal for small- to medium-size businesses and departmental solutions that require the power of an enterprise database on a system with up to two processors. Red Brick Enterprise Edition 6.3 provides all of the features of the Workgroup Edition for SMP systems with three or more processors.


Highlights


•High-speed, high-volume query performance with support for thousands of users


•High-performance information management


•Linear scalability that accommodates many terabytes on SMP hardware


•Low cost of ownership through greatly simplified maintenance




Performance enhancements


•Dynamic segment elimination: Complementing compile-time SmartScan optimizations, the server also supports run-time (dynamic) segment elimination. This further optimizes data access for table scans and TARGETjoins that use local indexes. Segment elimination is factored in when determining the correct query plan to choose.


•Memory mapping for queries: To optimize access to dimension tables and their primary-key indexes, these objects are partially or wholly memory-mapped during query execution.


•TARGETjoin performance improvements: The performance of TARGETjoin queries is optimized, and single-column B-TREE indexes can now participate in TARGETjoin plans.


•Optimizer directives: The server supports SET commands for controlling the availability of specific STAR indexes for single and multifact joins and for adjusting thresholds that influence the overall choice of STARjoin and TARGETjoin plans.


•Vista Enhancement: Precomputed views grouped by nullable columns can now be maintained incrementally.


•Table Management Utility (TMU): Loader enhancements are made in memory usage and tuning.




Ease-of-use enhancements


•Alter Table with working segment: You can add or drop table columns into a predefined working segment. This option provides greater reliability and recoverability than other ALTER TABLE modes.


•Additional SQL OLAP functions: The server supports additional functions for analytic queries: PERCENT_RANK, CUME_DIST, PERCENTILE_CONT, and PERCENTILE_DISC. The server also supports the ROUND scalar function.


•Expression support: You can define arithmetic expressions and conditional CASE expressions for input columns inside the TMU control file.


•Shrink system catalog: To improve performance, you can use the rb_syscompact utility to reduce the size of the database system catalog.


•3GB address space on the Microsoft Windows platform: To improve scalability and performance, Red Brick executables can use up to 3GB of virtual address space on the Windows platform.


•Multicharacter delimiter support: To extend delimiter support in the Table Management Utility (loader) and Export.


•Nested XML namespace support: To extend definition of XML data for loading.


•Support for .NET applications through the ODBC.NET data provider: Applications can use Microsoft's ODBC.NET provider and the .NET Framework 1.1 with the Red Brick ODBC driver to access data from Red Brick data warehouse.
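The analytic functions listed above (PERCENT_RANK, CUME_DIST, and friends) are now standard SQL window functions in most engines. A quick sketch of two of them using Python's sqlite3 (requires SQLite 3.25+; the sales figures are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("N", 100), ("S", 200), ("E", 300), ("W", 400)])

# PERCENT_RANK = (rank - 1) / (rows - 1); CUME_DIST = rank / rows.
rows = cur.execute("""
    SELECT region,
           amount,
           PERCENT_RANK() OVER (ORDER BY amount) AS pct_rank,
           CUME_DIST()    OVER (ORDER BY amount) AS cume_dist
    FROM sales
    ORDER BY amount
""").fetchall()
for r in rows:
    print(r)  # e.g. the lowest amount gets pct_rank 0.0, cume_dist 0.25
```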


see more

Teradata

More Choices. Great Value. Competitive Edge.
The Teradata Purpose-Built Platform Family has expanded to fit all your business needs. With the latest appliance options including Extreme Data Appliance, Data Warehouse Appliance, Data Mart Appliance, and Active Enterprise Data Warehouse, your company can start small while trusting that your best in class infrastructure will grow with you.

see more

Sybase IQ

The world’s #1 Column based analytics server keeps getting smarter. With its new in-database analytics capability, Sybase IQ enables a new generation of analytics performance to meet a new generation of analytics challenges.


Sybase IQ Enterprise Edition was designed to enable many users to perform extremely fast, flexible, interactive analysis, and ad-hoc queries, using off-the-shelf query tools. Sybase IQ can be loaded using a variety of methods including the Sybase ETL server, from flat files, directly from Sybase Adaptive Server Enterprise or Replication Server, or through Enterprise Connect from non-ASE databases, and typically stores the data in less than the size of the original raw data. The Multiplex Grid option allows Sybase IQ engines to run on multiple nodes of a shared disk cluster to load or query a single image of the database. The Multiplex Grid option provides tremendous user scalability, load jobs scalability and high availability. For many data warehousing and DSS applications, Sybase IQ Enterprise Edition can improve query response time by up to 100 times.

see more

Netezza

video: here

The Netezza TwinFin™ system is the fourth generation Netezza appliance that once again sets the standard for price/performance for data warehousing and business intelligence (BI). TwinFin is a purpose-built, standards-based data appliance that architecturally integrates database, server and storage into a single, easy to manage system. TwinFin is designed for rapid analysis of data volumes scaling into the petabytes, delivering 10-100x performance improvements at a third of the cost of other options available from traditional database vendors.

TwinFin delivers high performance out of the box, with no indexing or tuning required. It is delivered ready-to-go for immediate data loading and query execution and integrates with all leading ETL, BI and analytic applications through standard ODBC, JDBC and OLE DB interfaces. TwinFin is easily rolled into your data center, plugged in and ready to load data in a matter of hours.

Experience it for yourself. Test drive TwinFin against your own data and queries to see what it can do for your business. Click here to find out more www.netezza.com/testdrive.

Key Product Highlights:
Industry-leading price-performance – 10-100x the performance at 1/3 the cost of competitive offerings
A platform for advanced analytics – Orders of magnitude faster performance
Scalability – From under 1 terabyte to petabytes
Support for thousands of users and very complex, mixed workloads
Commodity blade-based architecture
Industry-standard, multi-core Intel-based blades, implemented in combination with commodity disk storage and Netezza’s patented data filtering using Field Programmable Gate Arrays (FPGAs)
Appliance simplicity – No indexing or tuning; minimal ongoing administration
Industry-standard interfaces (SQL, ODBC, JDBC, OLE DB)
Full compatibility with market-leading BI tools, applications and infrastructure
Enterprise-class reliability and availability – More than 99.99% uptime
Green – Low power and cooling requirements in a compact footprint
Fast load speeds – Up to 2 TB/hour
Fast backup rates – High-speed backup and restore at data rates as high as 4 TB/hour

mercredi 21 octobre 2009

BO Timeline


from wikipedia

lundi 19 octobre 2009

The 10 Essential Rules of Dimensional Modeling

http://intelligent-enterprise.informationweek.com/showArticle.jhtml?articleID=217700810&pgno=1


By Margy Ross

Rule #1: Load detailed atomic data into dimensional structures.

Always load the most detailed data, because users cannot live without the details.
That way, the data can then be summarized.

Rule #2: Structure dimensional models around business processes.

Business processes are the activities performed by your organization; they represent measurement events, like taking an order or billing a customer. Business processes typically capture or generate unique performance metrics associated with each event. These metrics translate into facts, with each business process represented by a single atomic fact table. In addition to single process fact tables, consolidated fact tables are sometimes created that combine metrics from multiple processes into one fact table at a common level of detail. Again, consolidated fact tables are a complement to the detailed single-process fact tables, not a substitute for them.

Rule #3: Ensure that every fact table has an associated date dimension table.

Obviously. The grain should be one day. Sometimes multiple foreign keys to the date dimension are represented in a single fact table.

Rule #4: Ensure that all facts in a single fact table are at the same grain or level of detail.

Rule #5: Resolve many-to-many relationships in fact tables.

Rule #6: Resolve many-to-one relationships in dimension tables.

Rule #7: Store report labels and filter domain values in dimension tables.

Rule #8: Make certain that dimension tables use a surrogate key.

Rule #9: Create conformed dimensions to integrate data across the enterprise.

Rule #10: Continuously balance requirements and realities to deliver a DW/BI solution that's accepted by business users and that supports their decision-making.
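Several of these rules (atomic grain, a date dimension on every fact table, surrogate keys) can be sketched as a tiny star schema. A minimal sketch through Python's sqlite3; the dim_date/dim_product/fact_sales names and columns are my own made-up example, not Kimball's:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
-- Rule #3/#8: a date dimension keyed by a surrogate key.
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,   -- surrogate key (Rule #8)
    full_date TEXT,
    month     INTEGER,
    year      INTEGER
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, -- surrogate key
    name        TEXT
);
-- Rules #1/#4: one atomic fact table, every row at the same grain
-- (one row per product per day, in this sketch).
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount      REAL
);
""")
cur.execute("INSERT INTO dim_date VALUES (20091019, '2009-10-19', 10, 2009)")
cur.execute("INSERT INTO dim_product VALUES (1, 'Widget')")
cur.execute("INSERT INTO fact_sales VALUES (20091019, 1, 9.99)")

# Rule #1 in action: detailed atomic rows can always be rolled up.
total = cur.execute("""
    SELECT d.year, SUM(f.amount)
    FROM fact_sales f JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.year
""").fetchone()
print(total)
```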

dimanche 18 octobre 2009

Crossover cable



RJ45




Networking two machines
It costs very little to establish a 10BaseT network between two PCs because you don't need a hub. All you need is a network card for each machine and a special 10BaseT cable called a crossover cable. You plug one end of this cable into each machine's network card, as shown in the figure above.

Figure above: You don't need a hub to create a two-machine 10BaseT network.

vendredi 16 octobre 2009

DW 2.0 - Architecture for the Next Generation of Data Warehousing

to be checked

http://www.information-management.com/issues/20060401/1051111-1.html

samedi 10 octobre 2009

Business Intelligence: OLAP

OLAP

http://training.inet.com/OLAP/home.htm

http://en.wikipedia.org/wiki/Online_analytical_processing

OLAP Collections
http://altaplana.com/olap/olap.collections.html

MDX

http://www.databasejournal.com/features/mssql/article.php/10894_1495511_1/MDX-at-First-Glance-Introduction-to-SQL-Server-MDX-Essentials.htm

jeudi 13 août 2009

new dw bible bought


I bought it on the 20th; it arrived on February 24th.

Today, the 25th, I am finishing chapter 1.

Chapter 1:
The terms Data Staging and Data Marts are no longer used.

mardi 2 juin 2009

what is new in BO Enterprise XI 3.1

Client tools
• BO Enterprise Java InfoView = the old Webi InfoView, but now it comes with Dashboard Builder and Performance Manager (I think).
• Dashboard Builder
• Performance Manager

• Web Intelligence Rich Client
the old Web Intelligence, enhanced.

• Desktop Intelligence = the old Business Objects Full Client Application

• Central Management Console (CMC) = the old Supervisor.

• Report Conversion Tool
convert BO Desktop Intelligence docs (*.rep) >>>>>>>>> Web Intelligence docs (*.wid)

• Universe Designer
the old Designer. It now comes with a menu option called "Query Panel", where you can see the query being built.

• Data Source Migration Wizard
• Business Views Manager
• Report Comparison Tool
• Import Wizard
• Publishing Wizard
• Query as a Web Service
• Translation Manager
• Universe Builder
• Universe Connection Manager
• Crystal Reports
• Voyager
• Set Analyzer
• Integration Kit for SAP
• Integration Kit for Peoplesoft
• Live Office

• Central Configuration Manager = gathers all Business Objects-related services (start/stop), like Windows services.
The services are:
• Apache Tomcat
• WinHTTP Web Proxy Auto-Discovery Service
• World Wide Web Publishing Service

vendredi 6 février 2009

BO designer: Divergences in Check cardinalities


In my opinion, we may get divergences in BO Designer during the cardinality check when:


We have a multilanguage selection in the query. This table has a composite primary key: the code and the language. Because of that, BO sees two rows for each N row in the other table, so it thinks you have an N:N relationship. We don't filter on the code at this point because the user will select it when creating the query:

*The circled part was introduced by me after the check. If I hit the Detect button, it says OK.
clang='EN'

The error is "cardinality is not valid" because I forced a 1:N relationship, while BO checks and sees an N:N relationship due to the missing language restriction.
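The effect is easy to reproduce in plain SQL. A sketch using Python's sqlite3; the country_label and sales tables are hypothetical stand-ins for the universe tables in the post:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
-- Hypothetical multilanguage lookup: composite PK (code, lang).
CREATE TABLE country_label (
    code  TEXT NOT NULL,
    lang  TEXT NOT NULL,
    label TEXT,
    PRIMARY KEY (code, lang)
);
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    code    TEXT
);
""")
cur.executemany("INSERT INTO country_label VALUES (?, ?, ?)",
                [("FR", "EN", "France"), ("FR", "PT", "Franca")])
cur.execute("INSERT INTO sales VALUES (1, 'FR')")

# Joining on code alone: each sales row matches one row per language,
# which is why the cardinality check detects N:N.
n = cur.execute("""SELECT COUNT(*) FROM sales s
                   JOIN country_label c ON s.code = c.code""").fetchone()[0]
print(n)  # 2 rows for a single sale

# Adding the language restriction (the clang='EN' condition from the
# post) restores the intended 1:N cardinality.
m = cur.execute("""SELECT COUNT(*) FROM sales s
                   JOIN country_label c
                     ON s.code = c.code AND c.lang = 'EN'""").fetchone()[0]
print(m)  # back to 1 row per sale
```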

lundi 2 février 2009

mercredi 28 janvier 2009

windows services explained

http://www.theeldergeek.com/alerter.htm

BO 6.5 Tracing file - ultimate tracing

Put this code into a bo_trace.ini file:

Code:

active=true;
importance='<<';
size=100000;
keep=true;

Put it in a tmp directory and create these environment variables (these point to my tmp directories):

BO_TRACE_CONFIGDIR -> C:\temp\TraceLog
BO_TRACE_CONFIGFILE -> C:\temp\TraceLog\BO_trace.ini
BO_TRACE_LOGDIR -> C:\temp\TraceLog

To disable tracing, you need to shut down the Webi service.