Priit Laes
April 08, 2010
draft-project-grumpy-gsoc-3

Project Grumpy
==============

Project Objective
+++++++++++++++++

There are many moments in every package maintainer's life when one wishes that one thing or another were done automatically:

* Check which packages have identified common QA issues.
* Generate a stabilization list for a selection of packages.
* Get notified of packages that have new upstream versions.
* Get notified of packages that can be stabilized under the 30-day guideline.

Many such automated or semi-automated applications and scripts already exist, but they are scattered across the Internet, typically with no good connection between the packages and the maintainer looking for the information. These applications include tinderbox rindex/dindex reports, gentoo-bumpchecker, manual repoman/pcheck runs, and so on.

Project "Grumpy" is intended as a Gentoo Linux project to aggregate the functionality of all these tools into one centralized application.

Abstract
++++++++

Project Grumpy is a set of applications for gathering, indexing and interacting with various ebuild- and developer-related metadata.

Grumpy Component Overview (aka deliverables)
++++++++++++++++++++++++++++++++++++++++++++

This section gives an overview of the components and technologies that are going to be used for this project.

Grumpy Application Backend
--------------------------

The Grumpy Application backend is the core of the Grumpy Application. The backend handles data storage and indexing and consists of the following components:

* Database storage for ebuild metadata
* Tools for gathering and managing metadata
* Portage indexer
* Upstream information checks (version bumps, issues, etc.)
* User-interface tools (web interface, command-line utilities)

Database
~~~~~~~~

The aim is to use a document-oriented storage system (MongoDB) that allows easy storage and retrieval of metadata in a JSON-like data schema.
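As a rough illustration of what "JSON-like" metadata could mean here, the following minimal sketch builds one package document and queries it in memory. The field names (``_id``, ``maintainers``, ``versions``, ``keywords``), the helper functions and the sample package are all hypothetical; the proposal does not define a schema, and no MongoDB driver is involved.

```python
import json


def make_package_document(category, name, versions, maintainers):
    """Build one JSON-serializable document describing a package.

    The layout is an assumption made for illustration only.
    """
    return {
        "_id": f"{category}/{name}",
        "category": category,
        "name": name,
        "maintainers": list(maintainers),
        "versions": [
            {"version": version, "keywords": list(keywords)}
            for version, keywords in versions
        ],
    }


def versions_with_keyword(document, keyword):
    """Return the versions in a package document that carry the exact keyword."""
    return [
        entry["version"]
        for entry in document["versions"]
        if keyword in entry["keywords"]
    ]


# Hypothetical sample data, not taken from the Portage tree.
doc = make_package_document(
    "app-misc",
    "example",
    [("1.0", ["amd64", "x86"]), ("1.1", ["~amd64"])],
    ["maintainer@example.org"],
)

# Round-trip through JSON to show the document is plain JSON-like data.
encoded = json.dumps(doc, sort_keys=True)
decoded = json.loads(encoded)

stable_on_amd64 = versions_with_keyword(decoded, "amd64")
testing_on_amd64 = versions_with_keyword(decoded, "~amd64")
```

A nested document like this keeps all versions of a package in one record, which is the kind of structure a document store handles without a fixed table layout.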
MongoDB sits in the gap between key-value stores (Memcached) and traditional RDBMSs (PostgreSQL, MySQL). It also offers advanced facilities such as map/reduce data aggregation, replication with fail-over support, and auto-sharding, should they ever be needed.

Tools for metadata management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are basically three types of tools:

* Firstly, tools that deal with low-level operations, such as keeping the Portage contents in synchronization with the database. For this part it is not yet clear whether existing software (Portage API or pkgcore) can be used or whether it has to be implemented from scratch.
* Secondly, tools that query outside sources for ebuild-related information (upstream version bumps, Bugzilla status, tinderbox results). Implementing these tools also requires working together with various parties to make sure we always get up-to-date data in a format that our tools can easily understand.
* Thirdly, utilities that allow users (Gentoo developers) to maintain the various kinds of information they are interested in. Two types of utilities are mainly in mind for this purpose: a web application providing both an HTML-based interface and a JSON API, and various command-line utilities built on the latter.

Timeline and Development Plan
+++++++++++++++++++++++++++++

The most crucial parts of this project are the data storage and the Portage indexer. Once it is clear that the contents of the database can be kept in synchronization with Portage (including package moves and slotting changes), work on other parts such as the upstream indexers and the web application can start. Therefore I propose the following tentative timeline:

24. May: Official start of project

* Implement Portage synchronization with the database
* Implement the 30-day stabilization checker
* Implement the upstream version checker for the GNOME project

12.
July: Mid-term evaluation submission starts

* Inquire whether LDAP authentication can be used for the web app

16. July: Deadline for mid-term student evaluations

* First sketches of the JSON API via the web application
* A few simple command-line utilities for developers to manage packages of interest

9. August: Start of 'pencils down'

20. August: Final evaluation deadline

Biography
+++++++++

I am an undergraduate student of theoretical physics at the University of Tartu, and my main research interests are cosmology and the nature of gravity. My leisure time mostly consists of working on various open-source or freelance projects, reading (either physics or science fiction) and spending time with friends. I have also held various positions in the past, including system administrator, embedded software developer and web application developer.

FUN: Origin of Grumpy's name
++++++++++++++++++++++++++++

This is an excerpt from the #gentoo-desktop channel on June 11, 2009::

    21:58 < leio> ok, I need a good codename for this maintainer website thing where you would be able to look things up, like what to bump, etc. Go!
    21:58 < plaes> grumpy ? :)
    21:58 < plaes> grumpy.gentoo.org ? :)
    21:59 < scarabeus> glocate
    21:59 < scarabeus> ;]
    21:59 < EvaSDK> grumpy++
    22:00 < EvaSDK> that would be awesome
    22:00 < scarabeus> but i agree with grumpy too
    22:00 < scarabeus> :}
    22:00 < scarabeus> sounds cool :]