Managing configuration with etc-update (or any similar tool) is a nightmare on any complex system. This proposal tries to come up with a better solution. It is a very rough draft at this point. I'll appreciate any comments. Well, almost any. Send them to ilya@theilya.com and put [config-system] in the subject.

Requirements

1. The single desktop scenario should be no more of a burden on the end user than it is now.
2. It should be easy to see the history of configuration file changes.
3. It should be easy to roll back.
4. It should be easy to manage common configuration between multiple machines.

Proposal

Use an SCM with proper branching and merging capabilities as the back end for storing configuration. I will use git as an example, since it has very good support for such things. Subversion, at the time of this writing, seems to be incapable of doing it well enough. Other SCMs might work, though.

Requirements for an SCM are:
- Atomic commits
- Proper tracking of branch merge states
- Ability to push commits between repositories
- Ability to track sync state of multiple repositories

Scenarios:

1. Single user, single desktop
2. Organization with groups of similar desktops (think beancounter machines, developers' machines, etc.)
3. A few HA or HPC clusters

The primary difference between scenarios 2 and 3 is that machines in a cluster get upgraded/reconfigured almost simultaneously, while in scenario 2 each individual machine can be upgraded at any time.

Presumptions:

If the same version of the same package is compiled on two different machines with identical package manager configuration, all config files generated by installing that package will be identical.

A "machine group", that is, a set of machines which share a master config tree, will have identical package manager configuration.

Questions:

1. Should we stick with the CONFIG_PROTECT concept, or make ebuilds specify each config file explicitly?
   A: No, drop that. Files owned by the package manager can be defined as configuration by ebuild authors.
   The package manager will provide sane defaults (like /etc) to be config-managed, but an ebuild can remove paths from them. Files not owned by the package manager can be marked for config management by the user. Optionally, the user may associate custom config files with specific atoms, so that they get a reminder when the corresponding packages are updated or removed.
2. What is the best way to track ownership and permissions of config files?
   A: The package manager will provide them to the config system, and internally we could just keep a file in mkdevs format somewhere.
3. What is the best way to recover from inconsistencies between config files generated by installs of the same package?
4. How do we handle reinstalls and USE flag changes?
5. What other factors can legitimately affect generated config files?
6. What information should the package manager itself be aware of? (E.g. what paths do we record?)

Config repository organization

Trees:
- master tree - the one where the package manager keeps all the changes. It is never directly modified. May be non-local.
- secondary tree - a local copy of the master tree. Optional. Can be pulled from, but never pushed into. Updated by pulling from the master tree. Can be promoted to become master.
- root tree - the one checked out into /. The end user may modify files there.
- temp tree - the one where all the work is done. It is a clone of the master tree, created to perform some action and then destroyed.

There will be the following branches:
- pkgmgr - branch where all the config files are generated by the package manager
- "group" - branches which hold all the changes common to some group of machines
- "machine" - branch where the final machine-specific changes are made

Steps during package management ((re)install, upgrade, remove):

0. The package manager checks that the root tree has no changes which weren't pushed to the master tree.
1. The package manager saves the list of config files from the previously installed version.
2. The package manager creates a tree containing the new configuration files.
3. The package manager lets the configurator know which packages have been changed (reinstalled, upgraded, or removed).
4. The configurator client is called by the end user.
5. The pkgmgr branch is checked out into the temp tree.
6. The configurator iterates over the changed packages. A package is skipped if a tag with the corresponding package version and options is already present; otherwise its config changes are committed to the pkgmgr branch, and the commit is tagged with version info.
7. The pkgmgr branch is merged into the group branch(es), and then into the branch for the local machine. All merges get tagged with the installed package state.
8. All changes are pushed back into the master tree.
9. The root tree pulls from the master.

Special case for a package downgrade:

If a package downgrade is performed, and the configurator sees a tag in the pkgmgr branch with the newly installed version and options, the user is offered the option to roll back the corresponding config files to the version on the machine branch right before the update to the next version of the same package.

Steps for making changes to configuration:

1. The user makes modifications to config files.
2. The user runs the configurator in commit mode.
3. The configurator checks out the next level of branch into the temp tree.
4. The configurator checks if the user has a next-level group branch configured. If not, skip to step 7.
5. The configurator iterates over changes in the root tree, prompting the user to decide which of them belong to the group. Each change selected as a group change is applied to the temp tree. Once done, the temp tree changes are committed.
6. The configurator merges changes from the current branch into the next branch.
7. The configurator applies the remaining changes to the temp tree.
8. The configurator commits the temp tree and pushes it to the master tree.
9. The configurator resets the root tree, then pulls the master tree.

Goals vs. the proposal

If we make it possible to configure the package manager to run the configurator in commit mode automatically before merging any new packages, the single desktop case has no extra administrative burden compared to the etc-update way.
At the same time, good UI tools for examining changes at all levels are available, which can make the end user's life a lot easier. This satisfies goal 1.

Any git history browsing tool can be used to navigate the root tree. This satisfies goal 2.

Group branches make management of multiple machines easy. This satisfies goal 4.

-------------------------------------------------------------

Random thoughts.

I am not sure what the use case is for more than one level of group branches, or even for more than one group branch per master tree. The cases I can imagine (multiple clusters with identical packages but different /etc/passwd? Sure, but what for?) are not likely to happen in real life. OTOH, it doesn't add any complexity, either to the configurator logic or for the end user, so let it be there.

We need a means of distributing information about which tree is the master between machines in a group.

It would be nice to provide a way for a new package to be installed into / with already merged configuration files. This way service interruption is minimal, especially when config file formats or locations change considerably.

It would be nice to have a way to record config file location changes between package versions. The implementation described above doesn't let us do it. Ebuilds would probably have to supply the information to make it possible.

It might make sense to do the group->machine merge for all available machine branches, and then just pull from master to root on each machine after an upgrade, if the installed package state tag is present.
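
The package-management flow described earlier (check out pkgmgr in the temp tree, commit and tag each changed package unless its tag already exists, merge down through the group and machine branches, push to the master tree) could be sketched roughly as follows. This is only an illustration under assumptions that are not part of the proposal: the Package type, the tag naming scheme in tag_name, the remote name "origin" for the master tree, and the function plan_update are all hypothetical. The function only plans the git commands and does not run them; copying the new config files into the temp tree and tagging the merges are left out.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    atom: str                          # e.g. "net-misc/openssh"
    version: str                       # e.g. "9.6"
    use_flags: frozenset = frozenset()


def tag_name(pkg):
    """Hypothetical tag scheme: one tag per (atom, version, options)."""
    flags = ",".join(sorted(pkg.use_flags))
    digest = hashlib.sha1(flags.encode()).hexdigest()[:8]
    return f"pkgmgr/{pkg.atom}-{pkg.version}/{digest}"


def plan_update(changed, existing_tags, group_branches, machine_branch):
    """Return the git commands (as argument lists) to run in the temp tree."""
    cmds = [["git", "checkout", "pkgmgr"]]
    committed = False
    for pkg in changed:
        tag = tag_name(pkg)
        if tag in existing_tags:
            continue  # this version/options pair is already recorded
        # Assumes the package's new config files were already copied
        # into the temp tree before these commands run.
        cmds += [
            ["git", "add", "--all"],
            ["git", "commit", "--allow-empty", "-m",
             f"{pkg.atom}-{pkg.version}: package defaults"],
            ["git", "tag", tag],
        ]
        committed = True
    if not committed:
        return []  # nothing new to merge or push

    # Cascade the merge: pkgmgr -> group branch(es) -> machine branch.
    upstream = "pkgmgr"
    for branch in [*group_branches, machine_branch]:
        cmds += [["git", "checkout", branch], ["git", "merge", upstream]]
        upstream = branch

    # Push all touched branches and the new tags back to the master tree.
    cmds.append(["git", "push", "--tags", "origin",
                 "pkgmgr", *group_branches, machine_branch])
    return cmds
```

A real configurator could pass each returned list to subprocess.run(cmd, cwd=temp_tree, check=True) and stop for user input when a merge reports conflicts.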