Jun 25, 2024

Data Warehouse

 

What is DATA MODELLING?
Data modelling (also called data architecture or dimensional modelling) is the process of creating a blueprint for how data will be organized and stored in the data warehouse. It involves identifying the entities, attributes, and relationships between entities in the data, and then designing tables and views to represent them. There are two common modelling techniques:
1. Star Schema
2. Snowflake Schema
Before jumping into these, we need to understand what dimension tables and fact tables are.

DIMENSION TABLE-
A dimension table stores descriptive attributes that are non-measurable and categorical in nature.
FACT TABLE-
A fact table is the central table that stores measurable, quantitative, or factual data (often aggregated) about a particular subject area.
Example-
Consider an e-commerce application with entities such as Products, Sales, Tax, Customer, and Discount.
In this scenario, Products could behave as a dimension table (descriptive attributes such as name and category), and Sales could behave as a fact table (measurable values such as quantity sold and revenue).

STAR SCHEMA-
In a star schema, the fact table sits at the center and each dimension table is joined directly to it. The resulting star-like structure is what gives the schema its name.
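
A minimal sketch of this structure using Python's built-in sqlite3 module; the table and column names (dim_Products, dim_Customers, fact_Sales) follow the e-commerce example above but are otherwise illustrative assumptions:

import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Dimension tables: descriptive, categorical attributes
cur.execute("""
CREATE TABLE dim_Products (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
)""")
cur.execute("""
CREATE TABLE dim_Customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT,
    country       TEXT
)""")

# Fact table at the center: measurable values plus foreign keys
# pointing out to each dimension (the "points" of the star)
cur.execute("""
CREATE TABLE fact_Sales (
    sale_id     INTEGER PRIMARY KEY,
    product_id  INTEGER REFERENCES dim_Products(product_id),
    customer_id INTEGER REFERENCES dim_Customers(customer_id),
    quantity    INTEGER,
    amount      REAL
)""")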


SNOWFLAKE SCHEMA-
A snowflake schema is a variation of the star schema in which dimension tables are normalized into multiple layers of related tables. This can be useful for modelling complex relationships within a dimension.
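
Continuing the hypothetical sketch above, a snowflake schema would normalize dim_Products further, moving the category attribute into its own table:

# Snowflake variant: category becomes its own dimension layer,
# and the product dimension references it instead of storing text inline
cur.execute("""
CREATE TABLE dim_Categories (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT
)""")
cur.execute("""
CREATE TABLE dim_Products_snowflake (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_Categories(category_id)
)""")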

Standard naming convention-
● A common prefix for fact tables is "FACT_" or "FT_". This prefix helps distinguish fact tables from dimension tables, e.g., fact_Sales or fact_Tax.
● A common prefix for dimension tables is "DIM_" or "D_". This prefix helps distinguish dimension tables from fact tables, e.g., dim_Products, dim_Customers, or dim_Discounts.

Types of Fact Table-
1. Transaction fact tables: They store detailed information about individual business transactions or events. They record every occurrence at the most granular level, providing a comprehensive view of operational data.
2. Periodic snapshot tables: They provide a summarized view of metrics over regular time intervals, storing aggregated data as of a specific point in time, such as the end of a day, week, or month (see the sketch after this list).
3. Accumulating snapshot tables: They track the stages of a business process or workflow, storing data at specific checkpoints within a process to show how it unfolds over time.
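
To make the difference between the first two types concrete, the sketch below (continuing the hypothetical sqlite3 example) rolls a transaction-grain fact table up into a daily periodic snapshot; the table names are assumptions:

# Transaction grain: one row per individual sale (fact_Sales above).
# Periodic snapshot: one aggregated row per product per day.
cur.execute("""
CREATE TABLE fact_Sales_daily AS
SELECT product_id,
       DATE('now')   AS snapshot_date,
       SUM(quantity) AS total_quantity,
       SUM(amount)   AS total_amount
FROM   fact_Sales
GROUP  BY product_id
""")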

Types of Dimension Table-


● Slowly Changing Dimension (SCD) Tables: These store attributes that change slowly over time, typically master data or lookup information such as product codes, customer IDs, or geographic codes. There are four main ways of handling such changes (a Type 2 sketch follows this list):
a. SCD Type 0: The attribute is static and never changes.
b. SCD Type 1: Overwrite the previous value; no history is kept.
c. SCD Type 2: Add a new row for each change, keeping full history.
d. SCD Type 3: Add a new column to keep limited history (e.g., the previous value).


● Conformed Dimension Tables: Standardized dimension tables that are shared across multiple fact tables or subject areas.

● Degenerate Dimension Tables: Degenerate dimensions are dimension attributes (such as an order or invoice number) that live directly in the fact table, with no separate dimension table of their own.

● Junk Dimension Tables: Junk dimension tables group together disparate low-cardinality attributes (such as flags and indicators) that do not fit neatly into other dimension tables.

● Role Playing Dimension: A dimension table that can assume different meanings or roles depending on the context of the analysis. A date dimension, for example, can play the roles of order date, ship date, and delivery date against the same fact table.

● Static Dimension Table: Static dimension tables store descriptive attribute data that does not change over time.

● Shrunken Dimension Tables: Shrunken dimension tables are dimension tables that contain a subset of the attributes from a larger dimension table.
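
A minimal sketch of the SCD Type 2 approach mentioned above, continuing the sqlite3 example; the bookkeeping columns (valid_from, valid_to, is_current) are common conventions, not from the original text:

from datetime import date

cur.execute("""
CREATE TABLE dim_Customers_scd2 (
    customer_key INTEGER PRIMARY KEY,  -- surrogate key
    customer_id  INTEGER,              -- natural/business key
    country      TEXT,
    valid_from   TEXT,
    valid_to     TEXT,
    is_current   INTEGER
)""")

def change_country(cur, customer_id, new_country):
    """SCD Type 2: close out the current row, then add a new one."""
    today = date.today().isoformat()
    cur.execute(
        "UPDATE dim_Customers_scd2 SET valid_to = ?, is_current = 0 "
        "WHERE customer_id = ? AND is_current = 1",
        (today, customer_id),
    )
    cur.execute(
        "INSERT INTO dim_Customers_scd2 "
        "(customer_id, country, valid_from, valid_to, is_current) "
        "VALUES (?, ?, ?, NULL, 1)",
        (customer_id, new_country, today),
    )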

 


Sep 22, 2023

Bitcoin Halving

Bitcoin halving is a significant event that occurs on the Bitcoin network approximately every four years (every 210,000 blocks). During this event, the block reward that miners receive for verifying transactions and adding new blocks to the blockchain is reduced by 50%. This slows the rate at which new bitcoins are created, and the total supply gradually approaches its maximum limit of 21 million.


Bitcoin halving is a programmed event, built into the Bitcoin protocol to keep the inflation rate of Bitcoin controlled and predictable. The reduced rate of new Bitcoin creation and the expectation of scarcity can increase the value of Bitcoin, and historically the asset's price has risen in the months leading up to a halving event.
That said, the market can be unpredictable, and the impact of a halving on Bitcoin's price is not guaranteed. Still, the reduced supply resulting from each halving helps maintain Bitcoin's value and ensures it remains a finite, scarce asset.

The previous Bitcoin halving occurred on May 11, 2020, at a block height of 630,000. At that time, the block reward for miners was reduced from 12.5 BTC to 6.25 BTC per block. This was the third halving event in Bitcoin's history, following the first halving in November 2012 and the second in July 2016. The next Bitcoin halving is expected around April 2024, at block height 840,000, at which point the block reward will be reduced from 6.25 BTC to 3.125 BTC per block.
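
The schedule is deterministic: the subsidy halves every 210,000 blocks. A simplified Python sketch of the rule (the real client works in integer satoshis, but the idea is the same):

def block_subsidy(height: int) -> float:
    """Block reward in BTC at a given block height (simplified)."""
    halvings = height // 210_000
    if halvings >= 64:          # after 64 halvings the subsidy is zero
        return 0.0
    return 50.0 / (2 ** halvings)

print(block_subsidy(0))        # 50.0   (genesis block era)
print(block_subsidy(630_000))  # 6.25   (the May 2020 halving)
print(block_subsidy(840_000))  # 3.125  (the next halving)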

After the first Bitcoin halving in November 2012, the price of Bitcoin increased by over 8,000% over the following year. After the second halving in July 2016, the price of Bitcoin increased by around 2,500% over the following 18 months. After the most recent halving event in May 2020, the price of Bitcoin initially experienced a slight drop but quickly recovered and went on to gain over 300% in value over the following year, reaching an all-time high of over $64,000 in April 2021.

Sep 7, 2023

Top 5 AWS Interview Tips

 

  • Can you describe the current AWS infrastructure and technologies used within the company?
    • This question shows your eagerness to understand the existing AWS setup and gives you insights into the company's tech stack.
  • What AWS services or tools are particularly important for this role, and how are they utilized here?
    • This question allows you to gauge the specific responsibilities of the role and how AWS is integrated into the company's operations.
  • How does the company handle AWS security and compliance, especially in relation to [mention relevant industry standards or regulations]?
    • Demonstrates your concern for security and regulatory compliance, which is crucial in many AWS roles.
  • What are the biggest challenges or projects related to AWS that the team is currently working on or will be working on in the near future?
    • Shows your interest in contributing to the team's objectives and your willingness to tackle challenges.

Sep 6, 2023

Database vs. Data Warehouse vs. Data Lake

In the world of data management, these three concepts play distinct roles in handling and organizing data. Let's explore their key differences:


📊 Database:
A database is a structured collection of data that is organized, stored, and managed using a predefined schema. It's designed for efficient data retrieval and modification. Databases are used to support transactional operations (OLTP), such as recording customer orders, tracking inventory, and managing user profiles.

🏢 Data Warehouse:
A data warehouse is a centralized repository that aggregates data from various sources across an organization. It's optimized for complex queries and data analysis (OLAP). Data warehouses often use a process called Extract, Transform, Load (ETL) to integrate data from different systems, transform it into a consistent format, and load it into the warehouse. They are used for business intelligence, reporting, and decision-making.

🌊 Data Lake:
A data lake is a vast storage repository that holds both structured and unstructured data at any scale. Unlike traditional databases, data lakes don't require a predefined schema. This flexibility allows organizations to store raw, unprocessed data from various sources, making it suitable for advanced analytics, machine learning, and exploration of new data sources, typically following an Extract, Load, Transform (ELT) pattern.
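
A toy Python sketch of the schema-on-write vs. schema-on-read distinction; the records and field names are invented for illustration:

import json
import sqlite3

# Schema-on-write (database / warehouse): structure enforced at load time
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (?, ?)", (1, 19.99))

# Schema-on-read (data lake): raw records are stored as-is, and
# structure is applied only when the data is read
raw_records = [
    '{"order_id": 1, "amount": 19.99, "note": "gift"}',
    '{"order_id": 2, "coupon": "SAVE10"}',  # a different shape is fine
]
parsed = [json.loads(r) for r in raw_records]
amounts = [r.get("amount") for r in parsed]  # tolerate missing fields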

💡 Key Takeaways:
Use Cases: Databases are used for day-to-day operations, data warehouses are for business intelligence, and data lakes are for storing and analyzing large volumes of diverse data.

Schema: Databases have a fixed schema, data warehouses have a structured schema for reporting, and data lakes allow schema-on-read for flexibility.

Data Types: Databases store structured data, data warehouses store structured and semi-structured data, and data lakes store structured, semi-structured, and unstructured data.

May 26, 2023

Solidity - Block Chain 3

 Internal and External

In addition to public and private, Solidity has two more types of visibility for functions: internal and external.

internal is the same as private, except that it's also accessible to contracts that inherit from this contract. (Hey, that sounds like what we want here!).

external is similar to public, except that these functions can ONLY be called outside the contract — they can't be called by other functions inside that contract. We'll talk about why you might want to use external vs public later.

For declaring internal or external functions, the syntax is the same as private and public:

contract Sandwich {
  uint private sandwichesEaten = 0;

  function eat() internal {
    sandwichesEaten++;
  }
}

contract BLT is Sandwich {
  uint private baconSandwichesEaten = 0;

  function eatWithBacon() public returns (string memory) {
    baconSandwichesEaten++;
    // We can call this here because it's internal
    eat();
  }
}

Interacting with other contracts

For our contract to talk to another contract on the blockchain that we don't own, first we need to define an interface.

Let's look at a simple example. Say there was a contract on the blockchain that looked like this:

contract LuckyNumber {
  mapping(address => uint) numbers;

  function setNum(uint _num) public {
    numbers[msg.sender] = _num;
  }

  function getNum(address _myAddress) public view returns (uint) {
    return numbers[_myAddress];
  }
}

This would be a simple contract where anyone could store their lucky number, and it will be associated with their Ethereum address. Then anyone else could look up that person's lucky number using their address.

Now let's say we had an external contract that wanted to read the data in this contract using the getNum function.

First we'd have to define an interface of the LuckyNumber contract:

contract NumberInterface {
  function getNum(address _myAddress) public view returns (uint);
}

Notice that this looks like defining a contract, with a few differences. For one, we're only declaring the functions we want to interact with — in this case getNum — and we don't mention any of the other functions or state variables.

Secondly, we're not defining the function bodies. Instead of curly braces ({ and }), we're simply ending the function declaration with a semi-colon (;).

So it kind of looks like a contract skeleton. This is how the compiler knows it's an interface.

By including this interface in our dapp's code our contract knows what the other contract's functions look like, how to call them, and what sort of response to expect.

Using an Interface

contract NumberInterface {
  function getNum(address _myAddress) public view returns (uint);
}

We can use it in a contract as follows:

contract MyContract {
  address NumberInterfaceAddress = 0xab38...
  // ^ The address of the LuckyNumber contract on Ethereum
  NumberInterface numberContract = NumberInterface(NumberInterfaceAddress);
  // Now `numberContract` is pointing to the other contract

  function someFunction() public {
    // Now we can call `getNum` from that contract:
    uint num = numberContract.getNum(msg.sender);
    // ...and do something with `num` here
  }
}

In this way, your contract can interact with any other contract on the Ethereum blockchain, as long as they expose those functions as public or external.

Handling Multiple Return Values


function multipleReturns() internal returns(uint a, uint b, uint c) {
  return (1, 2, 3);
}

function processMultipleReturns() external {
  uint a;
  uint b;
  uint c;
  // This is how you do multiple assignment:
  (a, b, c) = multipleReturns();
}

// Or if we only cared about one of the values:
function getLastReturnValue() external {
  uint c;
  // We can just leave the other fields blank:
  (,,c) = multipleReturns();
}

If statements

If statements in Solidity look just like JavaScript:

function eatBLT(string memory sandwich) public {
  // Remember with strings, we have to compare their keccak256 hashes
  // to check equality
  if (keccak256(abi.encodePacked(sandwich)) == keccak256(abi.encodePacked("BLT"))) {
    eat();
  }
}

Immutability of Contracts

So far, Solidity has looked quite similar to other languages like JavaScript. But there are a number of ways that Ethereum DApps are actually quite different from normal applications.

To start with, after you deploy a contract to Ethereum, it’s immutable, which means that it can never be modified or updated again.

The initial code you deploy to a contract is there to stay, permanently, on the blockchain. This is one reason security is such a huge concern in Solidity. If there's a flaw in your contract code, there's no way for you to patch it later. You would have to tell your users to start using a different smart contract address that has the fix.

But this is also a feature of smart contracts. The code is law. If you read the code of a smart contract and verify it, you can be sure that every time you call a function it's going to do exactly what the code says it will do. No one can later change that function and give you unexpected results.

External dependencies

Earlier, we hard-coded the CryptoKitties contract address into our DApp. But what would happen if the CryptoKitties contract had a bug and someone destroyed all the kitties?

It's unlikely, but if this did happen it would render our DApp completely useless — our DApp would point to a hardcoded address that no longer returned any kitties. Our zombies would be unable to feed on kitties, and we'd be unable to modify our contract to fix it.

For this reason, it often makes sense to have functions that will allow you to update key portions of the DApp.

For example, instead of hard coding the CryptoKitties contract address into our DApp, we should probably have a setKittyContractAddress function that lets us change this address in the future in case something happens to the CryptoKitties contract.

Ownable Contracts

setKittyContractAddress is external, so anyone can call it! That means anyone who called the function could change the address of the CryptoKitties contract, and break our app for all its users.

We do want the ability to update this address in our contract, but we don't want everyone to be able to update it.

To handle cases like this, one common practice that has emerged is to make contracts Ownable — meaning they have an owner (you) who has special privileges.

OpenZeppelin's Ownable contract

Below is the Ownable contract taken from the OpenZeppelin Solidity library. OpenZeppelin is a library of secure and community-vetted smart contracts that you can use in your own DApps. After this lesson, we highly recommend you check out their site to further your learning!

Give the contract below a read-through. You're going to see a few things we haven't learned yet, but don't worry, we'll talk about them afterward.

/**
 * @title Ownable
 * @dev The Ownable contract has an owner address, and provides basic authorization control
 * functions, this simplifies the implementation of "user permissions".
 */
contract Ownable {
  address private _owner;

  event OwnershipTransferred(
    address indexed previousOwner,
    address indexed newOwner
  );

  /**
   * @dev The Ownable constructor sets the original `owner` of the contract to the sender
   * account.
   */
  constructor() internal {
    _owner = msg.sender;
    emit OwnershipTransferred(address(0), _owner);
  }

  /**
   * @return the address of the owner.
   */
  function owner() public view returns(address) {
    return _owner;
  }

  /**
   * @dev Throws if called by any account other than the owner.
   */
  modifier onlyOwner() {
    require(isOwner());
    _;
  }

  /**
   * @return true if `msg.sender` is the owner of the contract.
   */
  function isOwner() public view returns(bool) {
    return msg.sender == _owner;
  }

  /**
   * @dev Allows the current owner to relinquish control of the contract.
   * @notice Renouncing to ownership will leave the contract without an owner.
   * It will not be possible to call the functions with the `onlyOwner`
   * modifier anymore.
   */
  function renounceOwnership() public onlyOwner {
    emit OwnershipTransferred(_owner, address(0));
    _owner = address(0);
  }

  /**
   * @dev Allows the current owner to transfer control of the contract to a newOwner.
   * @param newOwner The address to transfer ownership to.
   */
  function transferOwnership(address newOwner) public onlyOwner {
    _transferOwnership(newOwner);
  }

  /**
   * @dev Transfers control of the contract to a newOwner.
   * @param newOwner The address to transfer ownership to.
   */
  function _transferOwnership(address newOwner) internal {
    require(newOwner != address(0));
    emit OwnershipTransferred(_owner, newOwner);
    _owner = newOwner;
  }
}

A few new things here we haven't seen before:

  • Constructors: constructor() is a constructor, which is an optional special function. It will get executed only one time, when the contract is first created.
  • Function Modifiers: modifier onlyOwner(). Modifiers are kind of half-functions that are used to modify other functions, usually to check some requirements prior to execution. In this case, onlyOwner can be used to limit access so only the owner of the contract can run this function. We'll talk more about function modifiers in the next chapter, and what that weird _; does.
  • indexed keyword: don't worry about this one, we don't need it yet.

So the Ownable contract basically does the following:

  1. When a contract is created, its constructor sets the owner to msg.sender (the person who deployed it)

  2. It adds an onlyOwner modifier, which can restrict access to certain functions to only the owner

  3. It allows you to transfer the contract to a new owner

onlyOwner is such a common requirement for contracts that most Solidity DApps start with a copy/paste of this Ownable contract, and then their first contract inherits from it.

Since we want to limit setKittyContractAddress to onlyOwner, we're going to do the same for our contract.