---
GLEP: 64
Title: Export PMS's cached VDB information
Author: Anthony G. Basile <blueness@gentoo.org>
Type: Standards Track
Status: Final
Version: 1
Created: 2014-07-31
Last-Modified: 2017-10-12
Post-History: 2014-08-30
Content-Type: text/x-rst
---

Abstract
========

During build time, important information is generated by the package
manager (PM) about the package(s) being built. When Portage is used
as the PM, this information is cached on a per package basis in directories
under /var/db/pkg/<cat>/<pkg> (VDB).  While this information can be
regenerated on the fly, doing so may be expensive or impractical. Examples
of such information include a complete list of all files belonging to
a particular installed package or the dynamic linking information about
a package's executables and shared objects. To avoid the unnecessary cost of
regenerating it, and to facilitate interoperability between all PMs and other
tools that could use this information, all PMs should cache a standard set
of information and provide a common API for exporting it. In this GLEP, we
specify what information should be cached and exported.
[#COUNCIL-RATIFICATION]_

Motivation
==========

Information generated by the PM at build time spans the spectrum from easy to
difficult to regenerate.  Some information, like a package's HOMEPAGE, may be
trivially regenerated by simply grepping the package's ebuild in the Portage tree.
Despite this ease, however, even this information needs to be cached in case
the ebuild is removed from the tree, but the package is still installed on the
system.  But even if the installed package and the ebuild in the tree are not
"out of sync", there is yet another reason to cache information generated by
the PM at build time.  Some information, like the list of all installed files
belonging to a particular package, cannot be trivially regenerated.  If such
a list were not cached, the PM would have to rebuild the package in order to
regenerate it, and even then this regenerated list is not guaranteed to
represent the actual state of the installed package because of possible
changes in the environment of the rest of the system between builds.  Apart
from the fact that the PM itself needs this list when uninstalling, and so
must cache it for itself, listing a package's files is useful for other
utilities.  At the time of this writing, sys-apps/elfix,
app-portage/gentoolkit, app-portage/portage-utils and app-portage/eix are
some of the utilities that make use of Portage's VDB to obtain this
cached list.

Another example of information which is useful and expensive to regenerate,
but perhaps less obvious than the previous example, is linking information
such as that reported by running ``readelf`` or ``scanelf`` on ELF objects, or
similar utilities for other executable formats like Mach-O or COFF.  On a
"rolling release" such as Gentoo, tracing forward and reverse dependencies
between executable objects and their libraries is critical to avoid breakage
during upgrade. The need to trace these dependencies is evident in PMS
features like sub-slotting which aim to make sure that executables are always
consistently built against libraries: upgrading a library which breaks
backwards compatibility automatically triggers rebuilding of its dependent
executable(s) [#PMS-SPEC]_ [#SUBSLOTS]_. While sufficient in their own scope,
these PMS features have limitations: 1) this information is calculated to
ensure consistency at build time, but is not cached and exported afterwards
for use by other tools, such as ``revdep-pax`` which uses the same information
to consistently apply PaX markings between executable objects and libraries
[#REVDEP-PAX]_; and, 2) such information is not sufficiently fine-grained for
tools which require discrimination on the basis of ABI, SONAME, library path
name, etc. By caching and exporting this information, an entire "linkage graph"
of executable objects and libraries on a system can be constructed
[#LINKAGE-GRAPH]_ to facilitate quick traversal of both forwards and
backwards dependencies. Questions like "what are the path names of all the
executables on this system which link against libssl.so.1.0.0 for ABI=x32?"
can be quickly answered without having to reread the dynamic section of every
object on the system in a search for those which are x32 and need libssl.so.
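
As a rough illustration of the linkage graph described above, the following
Python sketch builds forward and reverse maps from cached linking records and
answers the libssl question with a single lookup.  The ``LinkRecord`` layout,
the function name and the sample entries are assumptions made for this
example; they are not a format or API defined by this GLEP.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class LinkRecord(NamedTuple):
    """One cached entry per executable or shared object (hypothetical layout)."""
    abi: str
    path: str
    soname: str              # empty for plain executables
    needed: Tuple[str, ...]  # SONAMEs this object links against


def build_linkage_graph(records):
    """Return (forward, reverse) dependency maps.

    forward: (abi, path)   -> set of SONAMEs the object needs
    reverse: (abi, soname) -> set of paths of objects that need it
    """
    forward = {}
    reverse = defaultdict(set)
    for rec in records:
        forward[(rec.abi, rec.path)] = set(rec.needed)
        for soname in rec.needed:
            reverse[(rec.abi, soname)].add(rec.path)
    return forward, dict(reverse)


# Made-up sample data standing in for the PM's cache.
RECORDS = [
    LinkRecord("x32", "/usr/bin/openssl", "",
               ("libssl.so.1.0.0", "libc.so.6")),
    LinkRecord("x32", "/usr/libx32/libssl.so.1.0.0", "libssl.so.1.0.0",
               ("libc.so.6",)),
    LinkRecord("amd64", "/usr/bin/curl", "",
               ("libssl.so.1.0.0", "libc.so.6")),
]

forward, reverse = build_linkage_graph(RECORDS)

# "Which x32 objects link against libssl.so.1.0.0?"
consumers = sorted(reverse[("x32", "libssl.so.1.0.0")])
```

Keying both maps on the ABI keeps objects for different ABIs apart even when
they need a library with the same SONAME.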

The above examples motivate us to create a uniform standard for any utility
that would like to make use of this generated information.  Below, we specify
a standard minimum set of information that should be generated by any PM at
build time, cached and then exported by a common API.

Specifications
==============

For each package installed, the following information should be generated
at build time, cached, and later exported:

* All Portage variables specified as part of the Metadata Cache defined
  in PMS 13.2 [#METADATA-CACHE]_. Note that, as with the Metadata Cache, these
  variables should be stored with all the conditionals evaluated.

* A list of all files belonging to the package, along with a designation of
  the file type (regular, directory, symlink, pipe, etc.), MD5SUM or other
  checksum, and mtime.

* A list of all executable or shared objects for each package and the
  corresponding linking information, including the full path to each object,
  its architecture and ABI, SONAME, RPATH and any NEEDED objects it links
  against, as reported by ``readelf`` on ELF systems, or similar tools for
  other executable formats.  Currently this information is being cached by
  Portage in NEEDED.ELF.2, NEEDED.MACHO.3, NEEDED.XCOFF, NEEDED.PECOFF, etc.

* Flags affecting the package's build system behavior, including at least
  CHOST, CBUILD, CTARGET, CFLAGS, CXXFLAGS, CPPFLAGS, and LDFLAGS.  In case a
  Fortran compiler is used, FFLAGS should also be included.  These may be
  empty in the case of packages where compiling/linking is unnecessary.

* Flags affecting the PM's behavior which are not already specified
  in PMS 13.2, including at least USE and KEYWORDS.

* Dependencies between packages calculated by the PM, including at least
  DEPEND, RDEPEND, and PDEPEND.

* Miscellaneous information including the time the package was built, the
  repository name, DEFINED_PHASES, EAPI, INHERITED eclasses and SLOT.
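
To make the file-list item above more concrete, here is a minimal Python
sketch that parses entries of the kind Portage currently stores in a
package's ``CONTENTS`` file.  It assumes lines of the form ``dir <path>``,
``obj <path> <md5> <mtime>`` and ``sym <path> -> <target> <mtime>``; other
PMs may cache the same data differently, and the ``Entry`` type and the
sample data are made up for the example.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One file-list entry (hypothetical in-memory representation)."""
    kind: str                     # "obj", "dir", "sym", ...
    path: str
    md5: Optional[str] = None
    mtime: Optional[int] = None
    target: Optional[str] = None  # symlink target, if any


def parse_contents_line(line: str) -> Entry:
    """Parse one line, assuming the Portage-style layout described above."""
    kind, rest = line.rstrip("\n").split(" ", 1)
    if kind == "obj":
        # Split from the right so that paths containing spaces survive.
        path, md5, mtime = rest.rsplit(" ", 2)
        return Entry(kind, path, md5=md5, mtime=int(mtime))
    if kind == "sym":
        link, mtime = rest.rsplit(" ", 1)
        path, target = link.split(" -> ", 1)
        return Entry(kind, path, mtime=int(mtime), target=target)
    # Remaining types (e.g. directories): only a path follows the type.
    return Entry(kind, rest)


# Made-up sample data.
SAMPLE = """\
dir /usr/bin
obj /usr/bin/timeout d41d8cd98f00b204e9800998ecf8427e 1406793600
sym /usr/bin/my link -> timeout 1406793601
"""

entries = [parse_contents_line(line) for line in SAMPLE.splitlines()]
```

A common export API would spare each utility from carrying its own copy of
such a parser.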

Implementation notes
====================

It is not the purpose of this GLEP to specify the details of a common API for
exporting the above information.  Even less so is it our purpose to delineate
the implementation details for each PM.  However, a common API for exporting
the above information should be developed and specified by the PM teams and be
included in future PMS documentation.  Any changes to the API should be
versioned to allow for consistency as it develops over time.

As a guide, we recommend a plain CLI API which answers questions as follows:

What is the SLOT number of a particular version of webkit-gtk?
  ::

      query-installed metadata =net-libs/webkit-gtk-2.4.4-r200 SLOT

What is the ABI of a particular file, and which libraries does it link against?
  ::

      query-installed file /usr/bin/timeout ABI NEEDED
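
The two queries above could be served by a very small front end.  The
following Python sketch is only one possible shape for such a tool: the
``query-installed`` name and its arguments come from the examples above,
while the in-memory ``METADATA`` and ``FILES`` tables, their values and the
``query`` helper are hypothetical stand-ins for a real PM's cache.

```python
import argparse

# Hypothetical stand-ins for the PM's cache; the values are made up.
METADATA = {
    "net-libs/webkit-gtk-2.4.4-r200": {"SLOT": "2", "EAPI": "5"},
}
FILES = {
    "/usr/bin/timeout": {"ABI": "amd64", "NEEDED": "libc.so.6"},
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with 'metadata' and 'file' subcommands."""
    parser = argparse.ArgumentParser(prog="query-installed")
    sub = parser.add_subparsers(dest="command", required=True)

    meta = sub.add_parser("metadata", help="query per-package variables")
    meta.add_argument("atom", help="exact atom, e.g. =cat/pkg-1.0")
    meta.add_argument("keys", nargs="+", help="variables to print")

    file_ = sub.add_parser("file", help="query per-file information")
    file_.add_argument("path", help="absolute path of an installed file")
    file_.add_argument("keys", nargs="+", help="fields to print")
    return parser


def query(argv):
    """Return the requested values, in the order the keys were given."""
    args = build_parser().parse_args(argv)
    if args.command == "metadata":
        record = METADATA[args.atom.lstrip("=")]
    else:
        record = FILES[args.path]
    return [record[key] for key in args.keys]


if __name__ == "__main__":
    import sys
    # With no arguments, fall back to the first example query.
    demo = ["metadata", "=net-libs/webkit-gtk-2.4.4-r200", "SLOT"]
    for value in query(sys.argv[1:] or demo):
        print(value)
```

Printing one value per line, in the order the keys were requested, keeps the
output easy to consume from shell scripts.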

Backwards compatibility
=======================

1. Portage has cached all the above information since v2.2_pre7 (2008-05-21);
   however, it is not exported via a consistent API. Versions of Portage that
   implement the API specified above can make use of caches built as far
   back as 2008.

2. For PMs that do not cache any of the above, a migration scheme should be
   implemented to generate the cache without having to rebuild world.

References
==========

.. [#COUNCIL-RATIFICATION] This has been ratified by the Council. See
   http://www.gentoo.org/proj/en/council/meeting-logs/20130910-summary.txt

.. [#PMS-SPEC] This is specified in PMS. See
   http://dev.gentoo.org/~zmedico/portage/doc/portage.html#package-ebuild-eapi-4-slot-abi-metadata-slot-sub-slot-abi

.. [#SUBSLOTS] Sub-slots_and_Slot-Operators
   https://wiki.gentoo.org/wiki/Sub-slots_and_Slot-Operators

.. [#REVDEP-PAX] http://git.overlays.gentoo.org/gitweb/?p=proj/elfix.git;a=blob;f=scripts/revdep-pax
   The man page can be viewed at http://www.linuxhowtos.org/manpages/1/revdep-pax.htm

.. [#LINKAGE-GRAPH] An example of such a class is at
   http://git.overlays.gentoo.org/gitweb/?p=proj/elfix.git;a=blob;f=pocs/link-maps/link_map.py.
   Portage itself constructs such a graph internally when evaluating emerge
   @preserved-rebuild.

.. [#METADATA-CACHE] https://projects.gentoo.org/pms/6/pms.html#x1-16300013

Copyright
=========

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0
Unported License.  To view a copy of this license, visit
https://creativecommons.org/licenses/by-sa/3.0/.