First stab at an Arch Linux lister.
Arch Linux provides several ways to discover packages, but no easy way to get the history of a package's previously released versions.
After some discussion on the Arch Linux forum (https://bbs.archlinux.org/viewtopic.php?id=275574), I've gone the git repository way.
This lister fetches a git repository and lists origins by parsing PKGBUILD files.
The Arch Linux distribution is made of the 'core', 'extra' and 'community' repositories.
'core' and 'extra' packages are listed in https://github.com/archlinux/svntogit-packages, and 'community' packages in https://github.com/archlinux/svntogit-community
For now it fetches only 'core' and 'extra' packages from the first repository (421.44 MiB at this time). I'll add the second one (1.58 GiB) if we are OK with the first implementation. Both git repositories receive several commits a day.
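To make the "list origins from a checkout" step concrete, here is a minimal sketch of walking a local clone of svntogit-packages. It assumes the `<package>/repos/<repo>-<arch>/PKGBUILD` layout visible in the example URLs of this description; the function name `iter_pkgbuilds` and its parameters are hypothetical and not the lister's actual code.

```python
from pathlib import Path
from typing import Iterator, Sequence, Tuple


def iter_pkgbuilds(
    checkout: Path, repos: Sequence[str] = ("core", "extra")
) -> Iterator[Tuple[str, str, str, Path]]:
    """Yield (package, repo, arch, path) for each PKGBUILD in the wanted repos.

    Assumes the layout <package>/repos/<repo>-<arch>/PKGBUILD.
    """
    for path in sorted(checkout.glob("*/repos/*/PKGBUILD")):
        # Directory name looks like "core-x86_64" or "extra-any".
        repo, _, arch = path.parent.name.partition("-")
        if repo not in repos:
            continue  # skip anything that is not core/extra (e.g. testing)
        package = path.parents[2].name
        yield package, repo, arch, path
```

Calling `iter_pkgbuilds(Path("svntogit-packages"))` on a local clone would then give one tuple per PKGBUILD to hand to the parser.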
PKGBUILD files are executable bash scripts. The common way to build a package is to use makepkg, which has an internal PKGBUILD parser: https://gitlab.archlinux.org/pacman/pacman/blob/master/scripts/makepkg.sh.in
I did not find a PKGBUILD parser for Python on PyPI. There is one Python module on GitHub named 'parched': https://github.com/sebnow/parched
I have written a naive parser, but it is not yet robust enough to handle all special cases.
Examples of PKGBUILD files I've found that can be really hard to parse:
* Firefox translations (xpi files): source and pkgname are dynamically generated, https://github.com/archlinux/svntogit-packages/blob/packages/firefox-i18n/repos/extra-any/PKGBUILD
* Licenses: a set of non-executable files, https://github.com/archlinux/svntogit-packages/blob/master/licenses/repos/core-any/PKGBUILD
* Bash: internal bash variables are sometimes referenced as ${myvar} and sometimes as $myvar, https://github.com/archlinux/svntogit-packages/blob/master/bash/repos/core-x86_64/PKGBUILD
* Libtool: DVCS sources, https://github.com/archlinux/svntogit-packages/blob/master/libtool/repos/core-x86_64/PKGBUILD
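
To illustrate what "naive" means here, the following is a minimal sketch of a line-based parser that reads top-level `name=value` assignments and expands both `${myvar}` and `$myvar`. It is not the parser in this change: the names `parse_pkgbuild` and `expand` are hypothetical, and it deliberately ignores multi-line arrays, functions, command substitution and dynamically generated values, which is exactly where the PKGBUILD files listed above would defeat it.

```python
import re
import shlex
from typing import Dict, List, Union

Value = Union[str, List[str]]

# Top-level assignment: no leading whitespace, so indented lines
# inside build()/package() functions are skipped.
ASSIGN_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")
# Matches both ${myvar} and $myvar.
VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)")


def expand(text: str, variables: Dict[str, Value]) -> str:
    """Replace known variable references; leave unknown ones untouched."""
    def repl(match: "re.Match[str]") -> str:
        value = variables.get(match.group(1) or match.group(2))
        if value is None:
            return match.group(0)
        if isinstance(value, list):  # bash: $array is its first element
            return value[0] if value else ""
        return value
    return VAR_RE.sub(repl, text)


def parse_pkgbuild(text: str) -> Dict[str, Value]:
    variables: Dict[str, Value] = {}
    for line in text.splitlines():
        match = ASSIGN_RE.match(line.rstrip())
        if not match:
            continue
        name, raw = match.groups()
        if raw.startswith("("):
            if not raw.endswith(")"):
                continue  # multi-line array: not handled by this sketch
            # Limitation: single-quoted items are expanded too.
            items = shlex.split(raw[1:-1])
            variables[name] = [expand(item, variables) for item in items]
        elif raw.startswith("'"):
            variables[name] = raw.strip("'")  # single quotes: no expansion
        else:
            variables[name] = expand(raw.strip('"'), variables)
    return variables


# Made-up sample mixing both reference styles (not a real PKGBUILD).
SAMPLE = """\
pkgname=bash
_basever=5.1
_patchlevel=016
pkgver=${_basever}.$_patchlevel
source=("https://example.org/$pkgname-${pkgver}.tar.gz")
"""

if __name__ == "__main__":
    print(parse_pkgbuild(SAMPLE)["pkgver"])  # prints 5.1.016
```

A more robust alternative would be to let bash itself evaluate the file, as makepkg does, at the cost of executing untrusted code.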