[HANA] Delta merge -1 (internal)

→ From this point on:
    "This content exists only in the paper and has never been implemented!"
     
     by SAP HANA Lab ('19.4.8 conference call)


 

 

■ Life-cycle Management of Records

Three stages of physical representation

  • L1-Delta
    • Accepts all incoming data requests

    • Stores records in row format (write-optimized)
      • Fast insert & delete
      • Fast field update
      • Fast record projection
    • No data compression
    • Holds 10,000 to 100,000 rows per single node

  • L2-Delta
    • Accepts bulk data
    • The second stage of the record life cycle
    • Stores records in column format
    • Uses an unsorted dictionary
      • requires secondary index structures to optimally support point query access patterns
    • Well suited to store up to 10 million rows

  • Main
    • Final data format
    • Stores records in column format
    • Highest compression rate
      • Sorted dictionary
      • Positions in dictionary stored in a bit-packed manner
      • The dictionary is also compressed
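
To make the Main format more concrete, here is a minimal sketch of a sorted dictionary with bit-packed dictionary positions. This is an illustration only, not HANA code: the function names `encode_main_column` and `decode_row` are hypothetical, and a single Python integer stands in for the bit-packed vector.

```python
def encode_main_column(values):
    """Dictionary-encode one column: sorted dictionary + bit-packed positions."""
    dictionary = sorted(set(values))
    position = {v: i for i, v in enumerate(dictionary)}
    # Bits needed to address every dictionary entry (at least 1).
    bits = max(1, (len(dictionary) - 1).bit_length())
    packed = 0
    for row, v in enumerate(values):
        # Row i occupies bits [i*bits, (i+1)*bits) of the packed integer.
        packed |= position[v] << (row * bits)
    return dictionary, bits, packed


def decode_row(dictionary, bits, packed, row):
    """Read one row back by extracting its dictionary position."""
    mask = (1 << bits) - 1
    return dictionary[(packed >> (row * bits)) & mask]
```

With three distinct values, each row costs only 2 bits, which is where most of the compression comes from. The compression of the dictionary itself is not shown here.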

 

 

■ Unified Table Access

  • Unified Table Access
    • A common abstract interface to access different stores
    • Records are propagated asynchronously
      • without interfering with running operations
    • Two Transformations (or merge steps)
      • L1-delta to L2-delta
      • L2-delta to main

 

  • Merge from L1-delta to L2-delta
    • Row format to column format conversion

    • Rows are inserted into the L2-delta column by column

    • Steps

      • Step-1 (Parallel) : appends new entries to the dictionary

      • Step-2 (Parallel) : column values are added using the dictionary encoding

      • Step-3 : the propagated entries are removed from the L1-delta

    • Needs no reconstruction of L2-delta structures
      • just appends entries to the unsorted dictionary
    • This merge can be incremental
    • Minimal impact on running transactions
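
The three steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the class `L2DeltaColumn`, the function `merge_l1_into_l2`, and the `batch_size` parameter are hypothetical names, the columns are processed sequentially here (the notes say steps 1 and 2 run in parallel), and a plain dict stands in for the secondary index.

```python
class L2DeltaColumn:
    def __init__(self):
        self.dictionary = []   # unsorted: values kept in arrival order
        self.value_to_id = {}  # stand-in for a secondary index
        self.value_ids = []    # one dictionary position per row

    def append_dictionary_entries(self, values):
        # Step 1: append values not yet in the dictionary (no re-sorting).
        for v in values:
            if v not in self.value_to_id:
                self.value_to_id[v] = len(self.dictionary)
                self.dictionary.append(v)

    def append_values(self, values):
        # Step 2: add the column values using the dictionary encoding.
        self.value_ids.extend(self.value_to_id[v] for v in values)


def merge_l1_into_l2(l1_rows, l2_columns, batch_size):
    """Move up to batch_size rows from the row-format L1-delta into L2-delta."""
    batch = l1_rows[:batch_size]
    for col_idx, column in enumerate(l2_columns):
        col_values = [row[col_idx] for row in batch]
        column.append_dictionary_entries(col_values)
        column.append_values(col_values)
    # Step 3: remove the propagated rows from the L1-delta.
    del l1_rows[:len(batch)]
    return len(batch)
```

Because existing dictionary entries never move, earlier value IDs stay valid, which is why the merge can be incremental and needs no reconstruction of the L2-delta.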

 

  • Merge from L2-delta to Main (what we usually mean by "delta merge")
    • Resource intensive task

      • A new main structure is created out of the L2-delta and the existing main
      • Should be carefully scheduled and highly optimized
    • Must be a complete merge
      • The old L2-delta is closed and a new one is created
      • Retries the merge on failure
    • Details of L2-delta-to-main merge (the classic approach)
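
As a rough sketch of what the classic merge does for a single column: a new sorted dictionary is built from both dictionaries, and all value IDs are remapped to their new positions. The function name `merge_delta_into_main` and the list-based layout are assumptions made for illustration; deleted-record handling and bit-packing are left out.

```python
def merge_delta_into_main(main_dict, main_ids, delta_dict, delta_ids):
    """Build a new main (sorted dictionary + value IDs) from main and L2-delta."""
    # New sorted dictionary containing every value from both stores.
    new_dict = sorted(set(main_dict) | set(delta_dict))
    new_pos = {v: i for i, v in enumerate(new_dict)}
    # Mapping tables: old dictionary position -> new dictionary position.
    main_map = [new_pos[v] for v in main_dict]
    delta_map = [new_pos[v] for v in delta_dict]
    # Rewrite all value IDs; delta rows follow the existing main rows.
    new_ids = [main_map[i] for i in main_ids] + [delta_map[i] for i in delta_ids]
    return new_dict, new_ids
```

Every value ID in the main has to be rewritten here, which is why this merge is resource-intensive and why the optimizations further below try to avoid touching the existing main.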

 

 

■ Life-cycle Management of Records

  • Merge Optimization
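    • Using the Main store's dictionary
      • Applies when the dictionary is already fully populated
      • Advantageous because no dictionary merge takes place
        • E.g., columns such as country code, year, or postal code
        • Since no dictionary merge is needed, the Main store's attribute is left unchanged
    • When the new dictionary values are greater than the existing dictionary values
      • E.g., the dictionary holds dates/times
      • A dictionary merge does occur, but it is simple: the delta dictionary is placed below (after the end of) the Main dictionary
        • The Main store's attribute is left unchanged
    • Single Column Merge
      • The merge is performed per single column rather than for the whole table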
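
The two dictionary cases above can be sketched as a small decision function. This is an illustration, not HANA's actual logic: `plan_dictionary_merge` and the labels "reuse", "append", and "full" are hypothetical, and the sketch assumes the Main dictionary is a sorted list.

```python
def plan_dictionary_merge(main_dict, delta_values):
    """Pick the cheapest dictionary strategy for one column (main_dict is sorted)."""
    main_set = set(main_dict)
    new_values = sorted(set(delta_values) - main_set)
    if not new_values:
        # Case 1: the Main dictionary already covers everything (country code, year, ...).
        return "reuse", main_dict
    if not main_dict or new_values[0] > main_dict[-1]:
        # Case 2: all new values sort after the existing ones (dates, timestamps, ...).
        return "append", main_dict + new_values
    # Otherwise a full dictionary merge with remapping of value IDs is needed.
    return "full", sorted(main_set | set(new_values))
```

In the "reuse" and "append" cases every existing entry keeps its position, so the value IDs already stored in the Main stay valid and do not have to be rewritten.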

 

    • Re-sorting Merge
      • To optimize compression, the entire set of tuples can be re-sorted
        • Individual columns are re-sorted to gain a higher compression rate
      • Because the structure is address-based, this can be implemented with a mapping structure
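
A minimal sketch of the idea, assuming rows are reordered by one chosen column and a mapping table records where each old row position ended up. The names `resort_rows` and `count_runs` are hypothetical, and the number of runs is used only as a rough proxy for how well run-length style compression would work.

```python
def count_runs(ids):
    """Number of runs of equal consecutive value IDs (fewer runs compress better)."""
    return sum(1 for i, v in enumerate(ids) if i == 0 or v != ids[i - 1])


def resort_rows(columns, sort_col):
    """Reorder all columns by the value IDs of sort_col; return the row mapping too."""
    n = len(columns[sort_col])
    # order[new_pos] = old_pos; sorted() is stable, so ties keep their old order.
    order = sorted(range(n), key=lambda r: columns[sort_col][r])
    resorted = {name: [ids[r] for r in order] for name, ids in columns.items()}
    # Mapping structure: old row position -> new row position.
    old_to_new = [0] * n
    for new_pos, old_pos in enumerate(order):
        old_to_new[old_pos] = new_pos
    return resorted, old_to_new
```

Anything that refers to a row by its position (address) would go through `old_to_new` after the merge.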
    • Partial Merge
      • Intended to reduce the merge overhead for large tables

      • The Main is split into two independent structures
        • Passive Main
          • not part of merge process
        • Active Main
          • takes part in the merge process with the L2-delta
          • only holds new values not in the passive main
        • Accesses are resolved in both dictionaries and parallel scans are performed on both structures (as one would expect)
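
The passive/active split could look roughly like this for a single column. It is a sketch under assumptions: the class `PartialMain` is hypothetical, active value IDs are numbered after the passive dictionary (IDs below `len(passive_dict)` point into the passive dictionary), and the scan runs sequentially rather than in parallel.

```python
class PartialMain:
    def __init__(self, passive_dict, passive_ids):
        self.passive_dict = list(passive_dict)  # sorted, never touched by merges
        self.passive_ids = list(passive_ids)
        self.active_dict = []  # sorted, holds only values absent from passive_dict
        self.active_ids = []   # IDs >= len(passive_dict) point into active_dict

    def _decode(self, gid):
        n = len(self.passive_dict)
        return self.passive_dict[gid] if gid < n else self.active_dict[gid - n]

    def merge_delta(self, delta_values):
        """Merge L2-delta values into the active main only."""
        n = len(self.passive_dict)
        passive_pos = {v: i for i, v in enumerate(self.passive_dict)}
        # Decode the current active rows before the active dictionary is rebuilt.
        active_values = [self._decode(g) for g in self.active_ids] + list(delta_values)
        self.active_dict = sorted({v for v in active_values if v not in passive_pos})
        active_pos = {v: n + i for i, v in enumerate(self.active_dict)}
        self.active_ids = [passive_pos.get(v, active_pos.get(v)) for v in active_values]

    def scan(self, value):
        """Return row positions matching value across passive and active main."""
        all_ids = self.passive_ids + self.active_ids
        return [row for row, gid in enumerate(all_ids) if self._decode(gid) == value]
```

Only the active part is rebuilt on each merge, so the cost depends on the size of the active main and the L2-delta, not on the whole table.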