Join two XML files using XSL Transformation

This post demonstrates how you can JOIN (yes, like in SQL) two XML files (datasets) into one using a common ID (the relationship).

shop.xml


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type='text/xsl' href='main.xsl'?>
<shop>
<product>
<id>100</id>
<title>hello</title>
</product>

<product>
<id>101</id>
<title>world</title>
</product>
<product>
<id>102</id>
<title>praveen</title>
</product>
</shop>

price.xml

<?xml version="1.0" encoding="UTF-8"?>
<shop>
<product>
<id>100</id>
<price>10.0</price>
</product>
<product>
<id>101</id>
<price>10.1</price>
</product>
<product>
<id>102</id>
<price>10.2</price>
</product>
<product>
<id>103</id>
<price>10.3</price>
</product>

</shop>

main.xsl


<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

<xsl:param name="fileName" select="'price.xml'" />

<xsl:variable name="updateItems" select="document($fileName)/shop/product" />

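<!-- Identity template: copies every node and attribute unchanged. -->
<!-- For a product, current()/id matches an entry in price.xml and its price is appended; for any other node the lookup selects nothing. -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
<xsl:copy-of select="$updateItems[id=current()/id]/price" />
</xsl:copy>
</xsl:template>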
</xsl:stylesheet>

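The resulting XML after transformation should look like the listing below. Product 103 exists only in price.xml, so it does not appear in the output (the join behaves like a LEFT JOIN from shop.xml):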


<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='main.xsl'?>
<shop>
<product>
<id>100</id>
<title>hello</title>
<price>10.0</price>
</product>
<product>
<id>101</id>
<title>world</title>
<price>10.1</price>
</product>
<product>
<id>102</id>
<title>praveen</title>
<price>10.2</price>
</product>
</shop>

Bonus: transformation code for .NET developers:


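// Requires: using System.Xml; using System.Xml.Xsl;
// XslTransform is obsolete; XslCompiledTransform replaces it.
// document() is disabled by default, so enable it through XsltSettings.
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(@"D:\Websites\xmltest\main.xsl", new XsltSettings(true, false), new XmlUrlResolver());
xslt.Transform(@"D:\Websites\xmltest\shop.xml", @"D:\Websites\xmltest\out.xml");
textBox1.Text = System.IO.File.ReadAllText(@"D:\Websites\xmltest\out.xml");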
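
For readers without an XSLT processor at hand, the same join can be sketched in Python with the standard library. This is only an illustration of the idea, not a replacement for the stylesheet: the function name join_prices and the inline XML strings are assumptions made for the example, and the data is a compact copy of shop.xml and price.xml above.

```python
import copy
import xml.etree.ElementTree as ET

# Compact copies of shop.xml and price.xml from this post (inlined for the example).
SHOP_XML = """<shop>
  <product><id>100</id><title>hello</title></product>
  <product><id>101</id><title>world</title></product>
  <product><id>102</id><title>praveen</title></product>
</shop>"""

PRICE_XML = """<shop>
  <product><id>100</id><price>10.0</price></product>
  <product><id>101</id><price>10.1</price></product>
  <product><id>102</id><price>10.2</price></product>
  <product><id>103</id><price>10.3</price></product>
</shop>"""


def join_prices(shop_xml, price_xml):
    """Append the matching <price> to every <product> in shop_xml (left join on <id>)."""
    shop = ET.fromstring(shop_xml)
    # Lookup table: id text -> <price> element from the second document.
    prices = {
        product.findtext("id"): product.find("price")
        for product in ET.fromstring(price_xml).iter("product")
    }
    for product in shop.iter("product"):
        price = prices.get(product.findtext("id"))
        if price is not None:
            # Copy the element so the price document is left untouched.
            product.append(copy.deepcopy(price))
    return shop


joined = join_prices(SHOP_XML, PRICE_XML)
print(ET.tostring(joined, encoding="unicode"))
```

As in the XSLT version, product 103 is dropped because it has no counterpart in the shop data.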

SSAS: Dimension Relationships in Cubes

A “dimension relationship” describes how a dimension is related, directly or indirectly, to a measure group in a cube.

Regular: The standard relationship, where a key column in the dimension is joined directly to the fact table.
Reference: A key column in the dimension is joined indirectly to the fact table by referencing another dimension.
Fact / Degenerate: The dimension is constructed from attribute columns in the fact table rather than from attribute columns in dimension tables.
Many-to-Many: The dimension is joined to the fact table through an intermediate fact table, so a single fact can be associated with multiple dimension members.

Read more: https://docs.microsoft.com/en-us/sql/analysis-services/multidimensional-models-olap-logical-cube-objects/dimension-relationships?view=sql-server-2017

Note: My study notes

Shu-Ha-Ri technique of Learning

Shu Ha Ri is a learning model, or technique, with three stages. At first (Shu), the learner follows one master and performs the activities without knowing the "why". They follow only one way of doing an activity, even though there may be other, more efficient ways to accomplish the same thing. Later (Ha), they learn more about the underlying details, start learning from different sources or masters, and begin to do the activities more efficiently. At the final stage (Ri), they start to think on their own and build their own ways of doing things within their comfort zone.

Common RAID levels explained

  • RAID 0 – Disk Striping

– Used for storing noncritical data that requires fast reads and writes.
– Does not have parity (parity is redundant information computed from the data, which can be used to detect or rebuild lost data)
– Does not have redundancy or fault tolerance, i.e., when any one drive dies, the data on the array is lost.

  • RAID 1 – Disk Mirroring

– Typically used for OS, SQL Engine and similar installations
– Data is written to two or more disks in parallel
– High read performance
– High availability
– No data loss on disk failure, as long as one mirror survives

  • RAID 2
  • RAID 3
  • RAID 4
  • RAID 5 – Striping with Parity

– Requires 3-16 drives
– No data loss on a single disk failure
– Reads are fast, but writes can be slower
– A failed drive can impact throughput until the array is rebuilt

  • RAID 6 – Striping with Double Parity

– Requires a minimum of 4 drives
– The capacity of two drives is used for storing parity data
– Reads are fast, but writes can be slower than RAID 5
– More fault tolerant than RAID 5: survives two drive failures

  • RAID 7
  • RAID 10 / RAID 1+0 – Striped Set of Mirrors

– “10” means combining 1 and 0, not “ten”
– Combines disk mirroring and disk striping
– Requires a minimum of 4 disks
– Best choice for I/O intensive applications
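
To make the idea of parity in RAID 5 more concrete, here is a minimal sketch in Python. It assumes the common XOR-based parity scheme and uses made-up four-byte blocks; the helper name xor_blocks is hypothetical. A real controller also rotates the parity block across drives, which this sketch leaves out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: one data block per drive.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the surviving blocks with the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Because every byte XORed with itself cancels out, the surviving blocks plus the parity leave exactly the missing block. The same property is why a second simultaneous failure is fatal for single parity, and why RAID 6 stores a second, independent parity.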

Note: Blog incomplete. Will be updated.
Note: My learning notes, source: Internet

Programming Puzzle #2 – Leet Converter

Write a program in a computer language of your choice to convert any given text to “leet format” in real time.

Leet (or “1337”), is a system of modified spellings used primarily on the Internet.

Input: “Translator”, Output: “Tr4nsl4t0r”
Input: “leet”, Output: “l33t”
Input: “Good Morning”, Output: “G00d M0rn1ng”

Evaluation criteria:

  1. Code Quality Standards
  2. OOAD/Object-Oriented Analysis & Design
  3. Application Logic
  4. Exception Handling
  5. Simplicity and Effectiveness of code

Time: 0-30 minutes max.
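
One possible starting point, sketched in Python. The substitution table (a to 4, e to 3, i to 1, o to 0) is inferred from the three examples above and is an assumption, as is treating upper-case vowels the same way; the puzzle itself does not fix the mapping, and the class name LeetConverter is only illustrative.

```python
class LeetConverter:
    """Converts text to leet using a configurable substitution table."""

    # Mapping inferred from the sample inputs and outputs (an assumption).
    DEFAULT_MAPPING = {"a": "4", "e": "3", "i": "1", "o": "0"}

    def __init__(self, mapping=None):
        mapping = dict(mapping or self.DEFAULT_MAPPING)
        # Apply the same substitution to upper-case letters (also an assumption).
        mapping.update({key.upper(): value for key, value in list(mapping.items())})
        self._table = str.maketrans(mapping)

    def convert(self, text):
        if not isinstance(text, str):
            raise TypeError("text must be a string")
        return text.translate(self._table)


converter = LeetConverter()
for sample in ("Translator", "leet", "Good Morning"):
    print(sample, "->", converter.convert(sample))
```

Since str.translate works character by character, calling convert on every keystroke is cheap enough for the "real time" requirement in a simple UI.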

Programming Puzzle #1–Find the critical path

Write a program in a language of your choice to find the critical path from a given set of tasks.

A critical path is determined by identifying the longest stretch of dependent activities and measuring the time required to complete them from start to finish.

image

Each circle (A-G) is a task with a specific duration (in hours).

Input:

An array of task names and durations, as given in the diagram.

Output:

1. Longest path (Critical path) is A+G+B+F+C+D (42 Hrs)
2. Shortest path is A+B+C+D (26 Hrs)
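
A minimal sketch of one approach in Python, assuming the tasks form a directed acyclic graph. The diagram is not reproduced here, so the durations and dependencies below are made-up placeholder data and not the values from the puzzle; the function name extreme_path is also hypothetical.

```python
from functools import lru_cache


def extreme_path(durations, successors, start, end, pick=max):
    """Return (total_duration, path) from start to end.

    pick=max gives the longest (critical) path, pick=min the shortest.
    Assumes the dependency graph has no cycles.
    """

    @lru_cache(maxsize=None)
    def best(task):
        if task == end:
            return durations[task], (task,)
        # Evaluate every follow-up task and drop dead ends that never reach `end`.
        options = [best(nxt) for nxt in successors.get(task, ())]
        options = [option for option in options if option is not None]
        if not options:
            return None
        total, path = pick(options, key=lambda option: option[0])
        return durations[task] + total, (task,) + path

    return best(start)


# Hypothetical data, NOT the values from the diagram.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}

longest = extreme_path(durations, successors, "A", "D", pick=max)
shortest = extreme_path(durations, successors, "A", "D", pick=min)
print("Critical path:", "+".join(longest[1]), f"({longest[0]} Hrs)")
print("Shortest path:", "+".join(shortest[1]), f"({shortest[0]} Hrs)")
```

Memoising the result per task keeps the work proportional to the number of dependencies rather than the number of possible paths.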

Why Cosmos DB may not be apt for building Data Warehouse?

Well, the question is slightly wrong until the context is specified, because it is possible to build a Modern Data Warehouse that includes Cosmos DB in the architecture. This is very relevant today because data is no longer only straightforward content with human-readable entities and relations (structured); it can also be unstructured and/or streaming. The pace of the data flow, and of business requirements, is also becoming near real-time.

See a reference architecture below:

Image Source: MS Docs

In this post, the context is the traditional Data Warehouse, where you model the data, specify relationships, and so on. Let us look at the definition of a Data Warehouse given in the Oracle Docs:

“A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing.”

Now let us ask the right question: why may Cosmos DB not be apt as the data store in a Data Warehouse? It is not apt because Cosmos DB is a NoSQL database, in which it is not easy to draw relationships between entities/tables/data. Check what an MSDN blog said about this:

“Cosmos DB is not a relational database. You cannot just take your relational database and expect it to run in Cosmos DB. You could move tables of data into Cosmos, but not the relational aspects of your existing data structures.”
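
To illustrate what "the relational aspects" means in practice, here is a small sketch in plain Python. It does not use any Cosmos DB API; the tables, documents and function names are made-up examples. The first half models a fact table joined to a dimension table by key, as a relational warehouse would; the second half models self-contained documents, where the related data has to be copied into each document instead.

```python
# Relational style: facts reference a dimension by key and are joined at query time.
products = {100: {"title": "hello"}, 101: {"title": "world"}}  # dimension table
sales = [                                                      # fact table
    {"product_id": 100, "amount": 10.0},
    {"product_id": 101, "amount": 10.1},
    {"product_id": 100, "amount": 5.0},
]


def totals_relational(sales, products):
    totals = {}
    for row in sales:
        title = products[row["product_id"]]["title"]  # the "join"
        totals[title] = totals.get(title, 0.0) + row["amount"]
    return totals


# Document style: each document is self-contained, so the product data is embedded.
sale_documents = [
    {"product": {"id": 100, "title": "hello"}, "amount": 10.0},
    {"product": {"id": 101, "title": "world"}, "amount": 10.1},
    {"product": {"id": 100, "title": "hello"}, "amount": 5.0},
]


def totals_documents(documents):
    totals = {}
    for doc in documents:
        title = doc["product"]["title"]  # no join: the data travels with the document
        totals[title] = totals.get(title, 0.0) + doc["amount"]
    return totals


print(totals_relational(sales, products))
print(totals_documents(sale_documents))
```

Both produce the same totals, but in the document version a change to a product title has to be applied to every document that embeds it, which is the kind of modelling work the quote refers to.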

As of today, this is the conclusion. But we cannot say what will happen to these concepts tomorrow, because Cosmos DB is becoming more powerful, and I am already in love with it.

You can read about common scenarios (use cases) in which you, or other companies, can use Cosmos DB here.

Do you have different thoughts on this? Please comment.