Image Repo Layout


#1

Some initial thoughts on image repo layout for discussion

Layout:

  • Whole images of all the required types should be generated and stored in the repo

  • Images should be spliced and portions extracted so as to isolate specific FS
    units (tables, trees, attributes, etc)

  • images/
    • base/ basic images to test general components
      • simple/
      • localdev/
      • invalid/
    • combined/ images w/ combined features of others (multiple partitions, fs’, etc)
    • virt/ images based on specific virt technologies
      • qcow/
      • vmware/
      • rhevm/
      • ms/
    • structures/
      • part/ data structures related to specific partition types
        • dos/
          • mbrs/
          • partitions/
          • extended/
        • gpt/
          • headers/
          • partitions/
      • fs/ data structures related to specific filesystem types
        • ext3/
          • superblocks/
          • inodes/
          • groups/
          • directories/
        • ext4/
          • trees/
        • fat/
          • boot_sectors/
          • directories/
          • tables/
        • iso9660/
        • ntfs/
          • attributes/
          • files/
          • mfts/
        • reiserfs/
        • xfs/
          • superblocks/
          • inodes/
          • directories/
          • trees/
    • application/ images with application specific content
      • systems/ operating systems
      • services/ sysvinit & systemd services
      • packages/ package system content
      • userdata/ files and specific content
    • meta/ additional image & repo related metadata content


Strategy:
Write small tasks / utility library to:

  • Generate filesystem layout locally
  • Generate specified disk images, parameterized, making use of some sort of image descriptor representation (templates?)
  • Extract specific content from disk images and write to fs
  • Sync image content on remote server from local fs
  • Sync image content in local fs from remote server
  • Optionally generate image metadata & repo metadata to be pulled into other applications
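
As one concrete example of the "extract specific content" task, the sketch below copies the first sector of a raw disk image into an mbrs/ directory. It relies only on the standard DOS MBR layout (a 512-byte first sector ending in the bytes 0x55 0xAA). The function names, the `.mbr` file suffix, and the destination path are hypothetical, and it assumes a raw image rather than a container format such as QCOW.

```ruby
require "fileutils"

MBR_SIZE = 512                                # a DOS MBR occupies the first 512-byte sector
MBR_SIGNATURE = [0x55, 0xAA].pack("C*")       # boot signature at byte offsets 510-511

# Read the first sector of a raw disk image and return it as a binary string.
def read_mbr(image_path)
  mbr = File.binread(image_path, MBR_SIZE)
  if mbr.nil? || mbr.bytesize < MBR_SIZE
    raise ArgumentError, "#{image_path} is shorter than #{MBR_SIZE} bytes"
  end
  unless mbr.byteslice(510, 2) == MBR_SIGNATURE
    raise ArgumentError, "#{image_path} has no MBR boot signature"
  end
  mbr
end

# Write the extracted sector under a (hypothetical) mbrs/ directory in the repo tree.
def extract_mbr(image_path, repo_root)
  dest_dir = File.join(repo_root, "structures", "part", "dos", "mbrs")
  FileUtils.mkdir_p(dest_dir)
  dest = File.join(dest_dir, "#{File.basename(image_path, '.*')}.mbr")
  File.binwrite(dest, read_mbr(image_path))
  dest
end
```

Other structures (GPT headers, superblocks, and so on) would need their own offsets and validity checks, so a real utility library would likely have one small extractor per structure type.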

#2

We already have some disk files that I was using for MiqDisk tests. I can probably just flip those over to the new repo format and transfer those existing tests.

Do we have a remote server for the images? I wasn’t sure that was in scope for the short term (it’s technically not needed in phase 1 if we are camcordering the tests).


#3

Here is where I use the disk test files. It requires a mount to the Scratch directory (an old ManageIQ file share that’s not public). We can take those files and move them to wherever the final place is, then tweak these tests (convert to RSpec camcorders?) to use the new location. This might be a good example to prove out the whole thing. cc @rpo

https://github.com/ManageIQ/manageiq/blob/master/lib/test/DiskTestCommon/tc_MiqDisk.rb#L9-L23


#4

Thanks, will look into these. Obviously anything will need to be integrated into rspec to be of use in the new test setup, but perhaps some of this could be migrated over.


#5

I’m thinking the repository should provide a coarse-grained organization of the image files within its tree structure. Additional attributes could then be defined for each image through some tag-like mechanism. The implementation of this tagging mechanism could be simple (a YAML file and a Ruby script, for example), and it would be stored and maintained in the repository.


Given the hierarchical nature of image metadata, I’d propose the coarse-grained organization be based on the highest metadata level contained within the image. For example, assuming the following represents the full metadata stack:

container->disk->LVM->FS

the top-level repository layout would contain sibling directories representing each metadata level:

  • Images/
    • containers/
      • QCOW/
      • VMDK/
      • Sparsed/
      • etc
    • Disks/
      • Files representing various disk/partition layouts.
    • LVM/
      • Files/directories representing various LVM configurations.
    • FileSystems/
      • ext3/
      • ext4/
      • ntfs/
      • xfs/
      • etc

Files under the containers directory would not include disk metadata - partition tables, etc. Files under the Disks directory would not include LVM or filesystem data, but may include container metadata - QCOW2 for example. And so on.
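
A rough sketch of that rule in Ruby, to make the classification concrete. It reads "highest metadata level" as the last level of the container->disk->LVM->FS stack that the image actually contains; that reading, the lowercase level names, and the `primary_directory` helper are assumptions for illustration.

```ruby
# Metadata stack from this post, ordered outermost to innermost.
STACK = %w[container disk lvm fs].freeze

# Top-level directory for each level, using the directory names proposed above.
LEVEL_DIRS = {
  "container" => "containers",
  "disk"      => "Disks",
  "lvm"       => "LVM",
  "fs"        => "FileSystems"
}.freeze

# Pick the directory for the innermost level present in the image.
def primary_directory(levels)
  top = STACK.reverse.find { |level| levels.include?(level) }
  raise ArgumentError, "no known metadata level in #{levels.inspect}" if top.nil?
  LEVEL_DIRS.fetch(top)
end

if __FILE__ == $PROGRAM_NAME
  puts primary_directory(%w[container disk])         # partitioned disk inside a container
  puts primary_directory(%w[container disk lvm fs])  # full stack down to a filesystem
end
```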

So, while images under the FileSystems directory may also contain LVM, Disk, and container metadata, their primary classification would be FileSystem. Tags could then be applied to more fully define the image:

ext3, LVM2, QCOW2, RHEL7

Here, even though the image file resides in the FileSystems/ext3 directory, we’ll also know that it’s from a RHEL7 distribution, is in a QCOW2 container, and is based on an LVM2 logical volume.
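
To show how simple the tagging mechanism could be, here is a minimal sketch of a YAML tag file plus a lookup helper. The file format, the image paths, and the `load_tags` / `images_tagged` names are all hypothetical; the only thing taken from the example above is the tag set itself.

```ruby
require "yaml"

# Hypothetical tag file: image path (relative to the repo root) => list of tags.
TAGS_YAML = <<~YAML
  FileSystems/ext3/rhel7-root.img:
    - ext3
    - LVM2
    - QCOW2
    - RHEL7
  FileSystems/ntfs/data.img:
    - ntfs
    - VMDK
YAML

# Parse the tag file; an empty file yields an empty hash.
def load_tags(yaml_text)
  YAML.safe_load(yaml_text) || {}
end

# Return the paths of images carrying all of the wanted tags (case-insensitive).
def images_tagged(tags_by_image, *wanted)
  wanted = wanted.map(&:downcase)
  matches = tags_by_image.select do |_path, tags|
    have = tags.map(&:downcase)
    wanted.all? { |tag| have.include?(tag) }
  end
  matches.keys.sort
end

if __FILE__ == $PROGRAM_NAME
  puts images_tagged(load_tags(TAGS_YAML), "qcow2", "rhel7")
end
```

A test could then ask for "any ext3 image in a QCOW2 container" without caring where the file lives in the tree.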


In addition to the top-level directories described above, I’d also include a directory for full VM images, which can include metadata files and multiple virtual disk images for each VM.

The repository structure would then look as follows:

  • Images/
    • containers/
      • QCOW/
      • VMDK/
      • Sparsed/
      • etc
    • Disks/
      • Files representing various disk/partition layouts.
    • LVM/
      • Files/directories representing various LVM configurations.
    • FileSystems/
      • ext3/
      • ext4/
      • ntfs/
      • xfs/
      • etc
    • VirtualMachines/
      • VMware/
      • RHEV/
      • SCVMM/
      • etc

Thoughts/comments?


#6

@rpo I like your idea. Sounds like a great way to organize all of the different types.


#7

@mmorsi is there any context/reference for what the Image Repo itself is about? Reading all of this, it seems to assume the reader already knows its purpose.
thanks


#8

@iheim mostly to contain a variety of disk images for testing and other purposes.

@rpo @bdunne am fine w/ the new layout, not too hung up on any particular structure. As it stands, my scripts will generate / clean up the layout in a configurable manner; only one YAML file needs to be edited to change it whichever way we want.

I don’t think it makes sense to store the maintenance scripts in the image repo itself, though, as it is code that could use the benefit of revision control. I think it’s best we have a public repo under the miq github where we can place this (a dummy YAML file could be pushed as a placeholder; the config containing sensitive server info would be excluded, obviously).

Will update the scripts to generate the new layout and will continue on the image building / tagging mechanism of it after the systemd stuff currently on my plate


#9

Are you talking about just the YAML files, or the actual images themselves? Keeping the images in git is probably a bad idea.


#10

@Fryguy was referring to the rake tasks / modules I wrote to set up the repo, archive it, generate images, + more. Currently it does employ a YAML-based config to determine the content of the repo that should be created / maintained. Was looking for a git repo for these, but no, the images themselves would not be stored in it.