Downloads cloud system images (basically qcow2 files) from web sites, keeps track of the images already downloaded, and thus downloads only the latest ones. It is intended to be run periodically (e.g. weekly).
This program is meant for cloud system administrators who need to fetch cloud images and put them into their cloud system so that they are available to their cloud users.
This program was written to download the images on a machine that has internet access and to provide them, on a shared file system, to another program that uploads them to a local cloud where the internet is not available. It does not upload files directly to the local cloud.
- Checksums (`checksums.rs`): provides a structure and functions to deal with different checksums (for now only SHA256 and SHA512).
- Command line interface (`cli.rs`): defines all the options that a user can use from the command line.
- Settings reader and aggregator (`settings.rs`): reads the configuration file and provides a structure that aggregates all settings from the command line, the configuration file and environment variables.
- Image history (`image_history.rs`): manages a structure with the history of all files that were previously successfully downloaded. Loads and saves the image history from an SQLite database.
- File downloader (`download.rs`): downloads the images, provides a summary of what has or has not been downloaded, and verifies the downloaded files. Images can be saved with a "normalized" filename. Uses the `trauma` crate.
- Image manager (`cloud_image.rs`): manages the cloud images that are to be found (on the websites), downloaded and verified.
- Website manager (`website.rs`): once the settings are loaded, uses the `httpdirectory` crate to get the latest images to download from a website. Several latest images can be downloaded from one website (e.g. `x86_64` and `aarch64` images).
- The program itself (`cid.rs`): glues everything together to download only the latest verified images to a destination path.
Options a user can provide to the program. These options change the program's behavior:
- verbosity level,
- configuration file (allows using a file other than the default configuration files),
- database path (allows changing the default location where the database is stored),
- maximum number of simultaneous downloads,
- verification of a file even if its download was skipped (the file exists in the destination path but not in the database).
The command line will not collect parameters to download images from a specific web site.
The settings module reads the configuration file and aggregates it with the parameters from the command line and environment variables. These settings may be:
- Web sites list
- Name of the web site
- Base URL where the images are found
- List of versions (of the images to download)
- List of complementary URLs (one may want more than one architecture, for instance)
- Image name filter: a regular expression used to find the image name we need
- Image name cleanse: a list of regular expressions that must not match the image name (i.e. any name matching one of these regular expressions is removed from the list of downloadable images)
- Destination path where a downloaded image will be saved
- Normalize: a filename template used to save the downloaded files in a "normalized" way (one can use the `date`, `version` and `after_version` parameters to customize the filename template, see `cloud-image-download.toml`)
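To illustrate how a "normalized" filename could be built from the `date`, `version` and `after_version` parameters, here is a minimal sketch. The `render_normalized_name` helper and the `{name}` placeholder syntax are assumptions made for this example; the actual template syntax is the one documented in `cloud-image-download.toml`.

```rust
/// Hypothetical helper: fills a filename template with the three
/// parameters named above. The `{name}` placeholder syntax is an
/// assumption, not necessarily the one the program uses.
pub fn render_normalized_name(
    template: &str,
    date: &str,
    version: &str,
    after_version: &str,
) -> String {
    template
        .replace("{after_version}", after_version)
        .replace("{version}", version)
        .replace("{date}", date)
}
```

With this sketch, a template such as `debian-{version}-{date}{after_version}.qcow2` gives every downloaded image of a site the same naming scheme, whatever the upstream filename was.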
The image history module keeps the names of successfully downloaded images in a database:
- The database should be created if it does not already exist.
- This module provides a function that tells whether an image is already in the database (meaning same name, checksum and date) and a function that saves the metadata of a downloaded image to the database.
- When a downloaded image has been successfully verified (against its checksum), the image name, its date and its checksum are saved into the database.
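The "same name, checksum and date" rule can be sketched with an in-memory set. This is only a stand-in for the SQLite-backed history: the `ImageRecord` and `ImageHistory` names, their fields and their methods are hypothetical and chosen for this example.

```rust
use std::collections::HashSet;

/// Identity of a downloaded image: same name, checksum and date.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ImageRecord {
    pub name: String,
    pub checksum: String,
    pub date: String,
}

/// In-memory stand-in for the database-backed history (assumption:
/// the real module stores the same three fields in SQLite).
#[derive(Default)]
pub struct ImageHistory {
    records: HashSet<ImageRecord>,
}

impl ImageHistory {
    /// Tells whether an image with the same name, checksum and date
    /// has already been recorded.
    pub fn contains(&self, record: &ImageRecord) -> bool {
        self.records.contains(record)
    }

    /// Records an image; meant to be called only once the downloaded
    /// file has been verified against its checksum.
    pub fn save(&mut self, record: ImageRecord) {
        self.records.insert(record);
    }
}
```

Because all three fields take part in the comparison, an image republished under the same name with a new date or checksum is treated as not yet downloaded.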
The file downloader downloads all the files that are to be downloaded into their final, configured destination. Checksum files are downloaded into memory; the checksum is inserted into the `CloudImage` structure and used to verify the downloaded file.
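As an illustration of what can be done with a checksum file held in memory, the sketch below looks up the digest of one image. It assumes the `<hex digest>  <filename>` layout written by `sha256sum` and `sha512sum` (with an optional `*` before the filename); other layouts, such as the BSD-style `SHA256 (file) = digest`, are not handled. The `find_checksum` name is hypothetical.

```rust
/// Looks up the checksum of `image_name` in the text of a checksum file.
/// Assumes one `<hex digest>  <filename>` entry per line; a leading `*`
/// on the filename (binary mode marker) is ignored.
pub fn find_checksum(checksum_file: &str, image_name: &str) -> Option<String> {
    checksum_file.lines().find_map(|line| {
        let mut parts = line.split_whitespace();
        let digest = parts.next()?;
        let filename = parts.next()?.trim_start_matches('*');
        // Normalize to lowercase so later comparisons are straightforward.
        (filename == image_name).then(|| digest.to_lowercase())
    })
}
```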
The checksums module deals with the different checksum types, SHA256 and SHA512 for now. Checksum computation and verification are done in the image manager module (`verify()` method of `CloudImage`).
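One possible way to model the two checksum types is an enum that carries the hexadecimal digest. The sketch below is an assumption about the shape of such a type, not the content of `checksums.rs`; it tells the two kinds apart by digest length (64 hexadecimal characters for SHA256, 128 for SHA512) and leaves the actual hashing to a dedicated library.

```rust
/// The two checksum kinds, each holding a lowercase hex digest.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Checksum {
    Sha256(String),
    Sha512(String),
}

impl Checksum {
    /// Guesses the kind from the digest length: 64 hex characters for
    /// SHA256 (32 bytes), 128 for SHA512 (64 bytes).
    pub fn from_hex(digest: &str) -> Option<Checksum> {
        let digest = digest.trim().to_lowercase();
        if !digest.chars().all(|c| c.is_ascii_hexdigit()) {
            return None;
        }
        match digest.len() {
            64 => Some(Checksum::Sha256(digest)),
            128 => Some(Checksum::Sha512(digest)),
            _ => None,
        }
    }

    /// Compares the expected digest with one computed elsewhere
    /// (the hashing itself is out of scope for this sketch).
    pub fn matches(&self, computed_hex: &str) -> bool {
        let expected = match self {
            Checksum::Sha256(d) | Checksum::Sha512(d) => d,
        };
        expected.eq_ignore_ascii_case(computed_hex.trim())
    }
}
```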
The image manager provides the `CloudImage` structure, a method to verify the checksum of an image, and a method that tells whether or not the file is in the database (it uses the method of the image history module).
The website manager uses the settings to scrape the defined websites and tries to find the latest cloud image to download among all the downloadable ones. When a URL lists directories whose names are formatted as dates (e.g. YYYYMMDD) or as numbers (e.g. 41, 42, ...), the latest one is added to the search list instead of the original one.
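Choosing the latest directory among date-like or numeric names can be sketched as follows. The `latest_directory` function is hypothetical and works on plain strings rather than on the structures returned by the `httpdirectory` crate; it also assumes that a given listing uses a single naming scheme, since dates and release numbers are compared as plain integers.

```rust
/// Returns the latest entry among directory names made only of digits,
/// which covers both dates such as `20240131` and releases such as `42`.
/// A trailing slash is ignored; other names are skipped.
pub fn latest_directory<'a>(names: &[&'a str]) -> Option<&'a str> {
    names
        .iter()
        .copied()
        .filter_map(|name| {
            let trimmed = name.trim_end_matches('/');
            if trimmed.is_empty() || !trimmed.bytes().all(|b| b.is_ascii_digit()) {
                return None;
            }
            // Keep the original name alongside its numeric value.
            trimmed.parse::<u64>().ok().map(|value| (value, name))
        })
        .max_by_key(|(value, _)| *value)
        .map(|(_, name)| name)
}
```

Since YYYYMMDD dates sort chronologically when compared as integers, a single numeric comparison serves both naming schemes.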
- `base16ct`: Used to encode binary checksums to human-readable base 16 strings
- `chrono`: Date and time manipulation library
- `clap`: The de facto standard for command line argument parsing
- `clap-verbosity`: Manages the -v (--verbose) or -q (--quiet) options along with the log system
- `colored`: To add some colors to log messages
- `config`: Reading configuration from files and environment variables
- `const_format`: Formats the `CID_USER_AGENT` const string at compile time
- `directories`: To get user directories from XDG specifications [1]
- `env_logger`: Logger
- `futures`: Enables iterating in parallel over vectors
- `httpdirectory`: Parses HTML directory URLs and returns a structured vector filled with directories and files
- `log`: Log system
- `regex`: Regular expression engine
- `reqwest`: Interface to HTTP requests, helps retrieving web pages
- `rusqlite`: Gives access to an SQLite database and is used to store the download history
- `serde`: Serialization / deserialization library used to read the configuration file into a dedicated structure
- `sha2`: Widely used library to process SHA checksums; the checksums out there for cloud images are SHA256 or SHA512
- `shellexpand`: Expands paths and filenames containing `~` or variables such as `${USER}` or `${HOME}`
- `tokio`: Used to spawn tasks to verify file checksums
- `trauma`: Downloads files and seems more maintained than the `downloader` crate
When something goes wrong, the idea is to fail early, as soon as possible.
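In Rust, failing early usually means returning a `Result` and propagating errors with `?` instead of carrying on with a bad state. The sketch below shows the pattern on two inputs checked before any download would start; the `SetupError` type and the `validate_setup` function are hypothetical and only serve the illustration.

```rust
use std::fmt;

/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum SetupError {
    EmptyDestination,
    BadDownloadLimit(String),
}

impl fmt::Display for SetupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SetupError::EmptyDestination => write!(f, "destination path is empty"),
            SetupError::BadDownloadLimit(raw) => write!(f, "invalid download limit: {raw}"),
        }
    }
}

impl std::error::Error for SetupError {}

/// Validates inputs up front so that a bad setting stops the run before
/// any download starts. Returns the parsed download limit.
pub fn validate_setup(destination: &str, download_limit: &str) -> Result<usize, SetupError> {
    if destination.trim().is_empty() {
        return Err(SetupError::EmptyDestination);
    }
    // `?` returns the error to the caller immediately.
    let limit: usize = download_limit
        .trim()
        .parse()
        .map_err(|_| SetupError::BadDownloadLimit(download_limit.to_string()))?;
    if limit == 0 {
        return Err(SetupError::BadDownloadLimit(download_limit.to_string()));
    }
    Ok(limit)
}
```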
