FINALTERM EXAMINATION
Spring 2010
CS614- Data Warehousing (Session - 3)
Time: 90 min
Marks: 60
Question No: 1 ( Marks: 1 ) - Please choose one
► Legacy systems
► Only internal data sources
► Privacy restrictions
► Small data mart
Question No: 2 ( Marks: 1 ) - Please choose one
► Data Retrieval
► Data Modification
► Development Cycle
► Data Replication
Question No: 3 ( Marks: 1 ) - Please choose one
► Common Column Values
► Common Row Values
► Different Index Values
► Value resulted by ad-hoc query
Question No: 4 ( Marks: 1 ) - Please choose one
► File
► Application
► Aggregate
► Database
Question No: 5 ( Marks: 1 ) - Please choose one
► One
► Two
► lg (n)
► n
Question No: 6 ( Marks: 1 ) - Please choose one
I An Abstraction
II A Representation
Which of the following option is true?
► I Only
► II Only
► Both I & II
► None of I & II
Question No: 7 ( Marks: 1 ) - Please choose one
► Pipeline Parallelism
► Overlapped Parallelism
► Massive Parallelism
► Distributed Parallelism
Question No: 8 ( Marks: 1 ) - Please choose one
► Skew in Partition
► Pipeline Distribution
► Distributed Distribution
► Uncontrolled Distribution
Question No: 9 ( Marks: 1 ) - Please choose one
► None of these
► Sequentially
► In Parallel
► Distributed
Question No: 10 ( Marks: 1 ) - Please choose one
► (a + b)M
► (a - b)M
► (a + b + M)
► (a * b * M)
Question No: 11 ( Marks: 1 ) - Please choose one
► Exploratory
► Non-Exploratory
► Computer Science
Question No: 12 ( Marks: 1 ) - Please choose one
► OLTP
► OLAP
► DSS
► DWH
Question No: 13 ( Marks: 1 ) - Please choose one
► Clustering
► Aggregation
► Segmentation
► Partitioning
Question No: 14 ( Marks: 1 ) - Please choose one
► Pearson correlation is the only technique
► Euclidean distance is the only technique
► Both Pearson correlation and Euclidean distance
► None of these
Question No: 15 ( Marks: 1 ) - Please choose one
► One-way Clustering
► Bi-clustering
► Pearson correlation
► Euclidean distance
Question No: 16 ( Marks: 1 ) - Please choose one
► Designing
► Development
► Analysis
► Implementation
Question No: 17 ( Marks: 1 ) - Please choose one
► Tools
► Industry
► Software
► None of these
Question No: 18 ( Marks: 1 ) - Please choose one
► Increasing
► Decreasing
► Maintaining
► None of these
Question No: 19 ( Marks: 1 ) - Please choose one
► Silver Bullet
► Golden Bullet
► Suitable Hardware
► Compatible Product
Question No: 20 ( Marks: 1 ) - Please choose one
► Rebuilding
► Success
► Good Stable Product
► None of these
Question No: 21 ( Marks: 1 ) - Please choose one
► Cotton-growing
► Rice-growing
► Weapon Producing
Question No: 22 ( Marks: 1 ) - Please choose one
► Data profiling
► Data Anomaly Detection
► Record Duplicate Detection
► None of these
Question No: 23 ( Marks: 1 ) - Please choose one
► Only One Direction
► Any Direction
► Two Direction
► None of these
Question No: 24 ( Marks: 1 ) - Please choose one
► True
► False
Question No: 25 ( Marks: 1 ) - Please choose one
► The lack of data integration and standardization
► Missing Data
► Data Stored in Heterogeneous Sources
Question No: 26 ( Marks: 1 ) - Please choose one
► OLE DB
► OLAP
► OLTP
► Data Warehouse
Question No: 27 ( Marks: 1 ) - Please choose one
► Tools
► Documentations
► Guidelines
Question No: 28 ( Marks: 1 ) - Please choose one
► Committed to the database
► Rolled back
Question No: 29 ( Marks: 1 ) - Please choose one
► Execution of package
► Creation of package
► Connection of package
Question No: 30 ( Marks: 1 ) - Please choose one
► One before Extraction and the other after Extraction
► One before Transformation and the other after Transformation
► One before Loading and the other after Loading
Question No: 31 ( Marks: 2 )
Question No: 32 ( Marks: 2 )
Value validation is the process of ensuring that each value that is sent to the data
warehouse is accurate.
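As an illustration, value validation can be sketched as a set of type, range, and domain checks applied to each incoming record before it is loaded; the field names and rules below are hypothetical, not from the paper:

```python
# Minimal sketch of value validation before loading into the warehouse.
# The fields ("age", "gender") and rules here are illustrative only.

def validate_record(record):
    """Return a list of validation errors for one incoming record."""
    errors = []
    # Type check: age must be an integer.
    if not isinstance(record.get("age"), int):
        errors.append("age: not an integer")
    # Range check: age must be plausible.
    elif not 0 <= record["age"] <= 120:
        errors.append("age: out of range")
    # Domain check: gender must come from a known code set.
    if record.get("gender") not in {"M", "F"}:
        errors.append("gender: unknown code")
    return errors

# Records that fail any check are routed to a reject file
# instead of being sent to the warehouse.
rows = [{"age": 34, "gender": "M"}, {"age": 250, "gender": "X"}]
rejects = [r for r in rows if validate_record(r)]
```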
Question No: 33 ( Marks: 2 )
Question No: 34 ( Marks: 2 )
Question No: 35 ( Marks: 3 )
- Waterfall model
- RAD model
- Spiral Model
Question No: 36 ( Marks: 3 )
1. Web data is unstructured and dynamic; keyword search is insufficient.
2. The web log contains a wealth of information, as it is a key touch point.
3. Shift from a distribution platform to a general communication platform.
Question No: 37 ( Marks: 3 )
- Providing connectivity to different databases
- Building query graphically
- Extraction data from disparate databases
- Transforming data
- Copying database objects
- Providing support for different scripting languages (by default VBScript and JavaScript)
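DTS itself is driven through graphical tools and COM objects, but the extract-transform-copy pattern listed above can be sketched in plain Python; the table and column names below are invented for illustration, with two in-memory SQLite databases standing in for the disparate source and destination systems:

```python
import sqlite3

# Sketch of a DTS-style flow: extract from one source, transform,
# and load into a destination database.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")

src.execute("CREATE TABLE sales (region TEXT, amount REAL)")
src.executemany("INSERT INTO sales VALUES (?, ?)",
                [("north", 100.0), ("south", 250.5)])

dst.execute("CREATE TABLE fact_sales (region TEXT, amount_cents INTEGER)")

# Extract each source row, transform it (uppercase the region,
# convert dollars to integer cents), and load it into the destination.
for region, amount in src.execute("SELECT region, amount FROM sales"):
    dst.execute("INSERT INTO fact_sales VALUES (?, ?)",
                (region.upper(), round(amount * 100)))

total = dst.execute("SELECT SUM(amount_cents) FROM fact_sales").fetchone()[0]
```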
Question No: 38 ( Marks: 3 )
Problems with reading a log/journal tape are many:
- It contains a lot of extraneous data.
- The format is often arcane.
- It often contains addresses instead of data values and keys.
- The sequencing of data in the log tape often has deep and complex implications.
- The log tape format varies widely from one DBMS to another.
Question No: 39 ( Marks: 5 )
SQL Server Data Transformation Services (DTS) is a set of graphical tools and programmable objects that allow you to extract, transform, and consolidate data from disparate sources into single or multiple destinations. SQL Server Enterprise Manager provides easy access to the DTS tools.
Question No: 40 ( Marks: 5 )
Ans:
The DWH development lifecycle (Kimball's approach) has three parallel tracks emanating from requirements definition. These are:
- the technology track,
- the data track, and
- the analytic applications track.
Analytic Applications Track:
Analytic applications also serve to encapsulate the analytic expertise of
the organization, providing a jump-start for the less analytically inclined.
It consists of two phases.
- Analytic applications specification
- Analytic applications development
Analytic applications specification:
The main features of analytic applications specification are:
- Starter set of 10-15 applications.
- Prioritize and narrow to the most critical capabilities.
- A single template is reused to produce the 15 applications.
- Set standards: menus, output, look and feel.
- From the standards: template, layout, input variables, calculations.
- Common understanding between business and IT users.
Following the business requirements definition, we need to review the findings and collected sample reports to identify a starter set of approximately 10 to 15 analytic applications. We want to narrow our initial focus to the most critical capabilities so that we can manage expectations and ensure on-time delivery. Business community input will be critical to this prioritization process. While 15 applications may not sound like much, a single template is reused to deliver them, so the set goes further than the number suggests.
Before designing the initial applications, it's important to establish standards for the applications, such as
- common pull-down menus and
- Consistent output look and feel.
Using the standards, we specify each application
- template,
- capturing sufficient Information about the layout,
- input variables,
- calculations, and
- breaks
so that both the application developer and business representatives share a common understanding.
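One way to picture such a specification is as structured data that both the application developer and business representatives can review together; every name below is invented purely for illustration:

```python
# Hypothetical analytic application template specification, capturing
# the items listed above: template, layout, input variables,
# calculations, and breaks.
sales_summary_spec = {
    "template": "standard_report",
    "layout": {"orientation": "landscape", "menu": "common_pulldown"},
    "input_variables": ["fiscal_year", "region"],
    "calculations": {"growth_pct": "(this_year - last_year) / last_year"},
    "breaks": ["region"],  # columns on which subtotals break
}
```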
During the application specification activity, we also must give consideration to the organization of the applications. We need to identify structured navigational paths to access the applications, reflecting the way users think about their business. Leveraging the Web and customizable information portals are the dominant strategies for disseminating application access.
Analytic applications development:
The main features of analytic applications development are:
- Standards: naming, coding, libraries etc.
- Coding begins AFTER DB design complete, data access tools installed,
subset of historical data loaded.
- Tools: Product specific high performance tricks, invest in tool-specific
education.
- Benefits: data quality problems are surfaced through tool usage and fed back to staging.
- Actual performance and time gauged.
When we move into the development phase for the analytic applications, we again need to focus on standards. Standards for
- naming conventions,
- calculations,
- libraries, and
- coding
should be established to minimize future rework. The application development
activity can begin once the database design is complete, the data access tools and metadata are installed, and a subset of historical data has been loaded. The application template specifications should be revisited to account for the inevitable changes to the data model since the specifications were completed.
We should arrange appropriate tool-specific education or supplemental resources for the development team.
While the applications are being developed, several ancillary benefits result. Application developers, armed with a robust data access tool, will quickly find the needles in the data haystack despite the quality assurance performed by the staging application. We need to allow time in the schedule to address any flaws identified by the analytic applications.
After realistically testing query response times, developers can then review performance-tuning strategies. The application development quality-assurance activities cannot be completed until the data is stabilized. We need to make sure that there is adequate time in the schedule beyond the final data staging cutoff to allow for an orderly wrap-up of the application development tasks.
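Realistic query-response testing, as described above, can be sketched with a simple timing harness; the table, query, and in-memory database below are placeholders for a real warehouse workload:

```python
import sqlite3
import time

# Placeholder warehouse: one fact table with a modest number of rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (k INTEGER, v REAL)")
conn.executemany("INSERT INTO fact VALUES (?, ?)",
                 [(i % 10, float(i)) for i in range(10_000)])

def timed_query(sql):
    """Run a query and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start

rows, elapsed = timed_query("SELECT k, SUM(v) FROM fact GROUP BY k")
# Compare elapsed against the service-level target before sign-off.
```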