Patches per src rpm in fedora

April 21, 2009

Luis asked me a question about how to get some info out of koji. We ended up
getting the same kind of info out of the source rpm repodata instead. It
wasn’t hard with repoquery, but once I had the numbers I wanted to find
out how we’ve been doing over the last few releases.

In Fedora, we try to encourage folks to push their patches upstream and not
carry them locally. So I put together a little script to see how many
patches we’re carrying per source rpm, on average.

The script is simple, obviously:

#!/bin/bash
# Usage: $0 <repoid> <source-repo-url>

srcrepo=$2
repoid=$1

# Count the files in all source rpms whose names contain "patch".
patches=`repoquery --quiet --repofrompath=$repoid,$srcrepo --repoid=$repoid \
--archlist=src -l -a | grep patch | wc -l`
# Count the source rpms themselves.
pkgs=`repoquery --quiet --repofrompath=$repoid,$srcrepo --repoid=$repoid \
--archlist=src -a | wc -l`

result=`echo "$patches/$pkgs" | bc -l`
echo $repoid has $patches patches in $pkgs src rpms
echo $result patches per srpm
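The counting and division steps can be tried without touching a repository. What follows is only a sketch: the helper names (count_patches, per_srpm, sample_listing) and the file listing are made up for illustration, and awk stands in for bc purely to round the ratio to four places.

```shell
#!/bin/bash

# count_patches: hypothetical helper. Reads a file listing on stdin (one
# path per line, like the output of `repoquery -l`) and counts the lines
# containing "patch", the same filter as `grep patch | wc -l`.
count_patches() {
    grep -c patch
}

# per_srpm: hypothetical helper. Divides a patch count by a package count
# and prints the ratio rounded to four decimal places.
per_srpm() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.4f\n", p / n }'
}

# sample_listing: a made-up listing for two imaginary source rpms.
sample_listing() {
    cat <<'EOF'
/foo-1.0.tar.gz
/foo-1.0-build.patch
/foo.spec
/bar-2.1.tar.bz2
/bar-2.1-fix-crash.patch
/bar-2.1-manpage.patch
/bar.spec
EOF
}

patches=$(sample_listing | count_patches)
echo "sample has $patches patches in 2 src rpms"
per_srpm "$patches" 2
```

Note that matching on "patch" anywhere in the path also counts any other file that happens to have "patch" in its name, so the totals are approximate.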

Here are the locations I used:
f7:

http://archive.fedoraproject.org/pub/archive/fedora/linux/releases/7/Everything/source/SRPMS/

f8:

http://archive.fedoraproject.org/pub/archive/fedora/linux/releases/8/Everything/source/SRPMS/

f9:

http://download.fedora.redhat.com/pub/fedora/linux/releases/9/Everything/source/SRPMS/

f10:

http://download.fedora.redhat.com/pub/fedora/linux/releases/10/Everything/source/SRPMS/

rawhide:

http://download.fedora.redhat.com/pub/fedora/linux/development/source/SRPMS/

And here are the results:

f7 has 7475 patches in 4226 src rpms
1.76881211547562707051 patches per srpm

f8 has 7991 patches in 4834 src rpms
1.65308233347124534546 patches per srpm

f9 has 9151 patches in 5547 src rpms
1.64972056967730304669 patches per srpm

f10 has 9386 patches in 6406 src rpms
1.46518888541991882610 patches per srpm

rawhide has 10278 patches in 7444 src rpms
1.38070929607737775389 patches per srpm

So we’ve been improving from release to release. Not a lot of improvement
between f8 and f9 but still some. All in all, I’m pretty happy about that.

As a contrast I also ran it on CentOS 4.7 and CentOS 5.3. These two
distros have to carry a lot of patches in order to maintain backward
compatibility with their original versions:

centos 4.7 has 8446 patches in 876 src rpms
9.64155251141552511415 patches per srpm

centos 5.3 has 10238 patches in 1186 src rpms
8.63237774030354131534 patches per srpm

That looks about right, I think.

Maybe this is useful trivia, maybe not, but it was interesting nevertheless.

Update:  Josh Boyer pointed out that my script didn’t pick up patches named *.diff so I changed the script and here are the results:

f7 has 7648 patches in 4226 src rpms
1.80974917179365830572 patches per srpm

f8 has 8191 patches in 4834 src rpms
1.69445593711212246586 patches per srpm

f9 has 9350 patches in 5547 src rpms
1.68559581755904092302 patches per srpm

f10 has 9595 patches in 6406 src rpms
1.49781454886044333437 patches per srpm

rawhide-src has 10548 patches in 7444 src rpms
1.41698011821601289629 patches per srpm

Same ballpark, afaict.
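The changed script isn’t shown here, so this is only a guess at the kind of change involved: widen the filter so it also counts file names ending in .diff. The helper name and the listing are made up for illustration.

```shell
#!/bin/bash

# count_patch_files: hypothetical helper. Reads a file listing on stdin and
# counts names that contain "patch" (the original filter) or end in ".diff".
count_patch_files() {
    grep -cE 'patch|\.diff$'
}

# Made-up listing: one .patch file, one .diff file, two files that are neither.
printf '%s\n' \
    /foo-1.0.tar.gz \
    /foo-1.0-build.patch \
    /foo-1.0-paths.diff \
    /foo.spec | count_patch_files
```

Anchoring .diff to the end of the name keeps tarballs such as a diffutils source archive from being counted as patches.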

4 Responses to “Patches per src rpm in fedora”

  1. syskill Says:

    I suspected that the kernel would be an outlier, and a modified version of your script seems to confirm that (especially in CentOS):

    f10 has 9229 patches in 6407 src rpms
    1.44045575152177306071 patches per srpm
    f10 kernel srpm has 105 patches
    without the kernel, 1.42428972837964408367 patches per srpm

    el5 has 10229 patches in 1187 src rpms
    8.61752316764953664700 patches per srpm
    el5 kernel srpm has 2781 patches
    without the kernel, 6.27993254637436762225 patches per srpm

  2. foo Says:

    There are some folks who are particularly bad at complying with the “Fedora does things upstream” rhetoric unfortunately. I suggest you develop some way to determine which patches are forwarded upstream and use that to look at the metric that really matters.

  3. Luis Says:

    For what it is worth, I wasn’t asking for it as a measure of ‘are people good/bad at getting things upstream’ but more as ‘is it in fact true that in most cases what distro users are getting is not pristine source’? The answer would seem to be ‘close to pristine but in the average case definitely not pristine’.

    (If you wanted to do this seriously someone pointed out you’d want to measure variance, but for my informal research these numbers are excellent- thanks again, Seth.)

  4. James Antill Says:

    Also you really need to measure the lines of code in the patches, but this is more work :(.

    For instance a single patch in F10 yum to change the default for an option is now counted the same as the rawhide yum patch “rebase to stable HEAD”.

    I’d also argue that it’s getting pretty common to use “git format-patch” approaches, so one fix might actually be multiple patches. So 1-2 patches on average is much more likely to be “pristine, but with a single (build) fix”.
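The "without the kernel" figures in the first comment come down to subtracting one package’s patches from the total and dividing by one fewer src rpm. A minimal sketch of that arithmetic, using the numbers quoted there; the helper name avg_without is made up, and awk is used only to round to four places.

```shell
#!/bin/bash

# avg_without: hypothetical helper. Takes the total patch count, the total
# number of src rpms, and the patch count of one package to leave out, then
# prints the average over the remaining packages.
avg_without() {
    awk -v p="$1" -v n="$2" -v k="$3" \
        'BEGIN { printf "%.4f\n", (p - k) / (n - 1) }'
}

avg_without 9229 6407 105      # f10 with the kernel srpm left out
avg_without 10229 1187 2781    # el5 with the kernel srpm left out
```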

