
I have to deal with some very long patches that have many sections that are of interest. What I need is a tool that allows basic regular expressions to search and which then gets every file that matches. IE if a patch changed files a.c, b.c, and c.c and I did a search for printf then I would want patches for the files which have changes related to printf and have the changes include all sections for each file in question.

Failing that does anyone know of a tool to split a large patch into one separate patch for each file that is patched? EG a patch would be split into a.c.diff, b.c.diff, and c.c.diff? Once a patch was split like that I could use some shell code that involves grep and rm to get the patches I really want.

A quick google search gets lots of hits for "patch" and "grep" which are not related to what I want to do. I might write a little Perl program to implement the latter, but the former would be a good thing to have.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

On Sun, 10 Jun 2012, Russell Coker <russell@coker.com.au> wrote:
> Failing that does anyone know of a tool to split a large patch into one separate patch for each file that is patched?
http://www.clearchain.com/blog/posts/splitting-a-patch

The above URL has a little Ruby program to do this. It allows me to get this job done, but it would still be nice to have a patch grepping program.

On 2012-06-10 13:45, Russell Coker wrote:
> I have to deal with some very long patches that have many sections that are of interest. What I need is a tool that allows basic regular expressions to search and which then gets every file that matches. IE if a patch changed files a.c, b.c, and c.c and I did a search for printf then I would want patches for the files which have changes related to printf and have the changes include all sections for each file in question.
> Failing that does anyone know of a tool to split a large patch into one separate patch for each file that is patched? EG a patch would be split into a.c.diff, b.c.diff, and c.c.diff? Once a patch was split like that I could use some shell code that involves grep and rm to get the patches I really want.
> A quick google search gets lots of hits for "patch" and "grep" which are not related to what I want to do. I might write a little Perl program to implement the latter, but the former would be a good thing to have.
If I had to do something like this I think I'd create a git repository of the original codebase, make a commit, apply patch, make another commit, and then use git to search for changes. I don't know if this is suitable for you, but it's the easiest thing I can think of.

-- 
Regards,
Matthew Cengia

On Sun, 10 Jun 2012, Matthew Cengia <mattcen@gmail.com> wrote:
> If I had to do something like this I think I'd create a git repository of the original codebase, make a commit, apply patch, make another commit, and then use git to search for changes. I don't know if this is suitable for you, but it's the easiest thing I can think of.
How would you do that exactly? The git grep command gives you a list of files matching a regex. But I could get that result by just applying the patch normally and running "grep -R".

The problem is not just getting a list of changed files, it's extracting the patches for those files. In this case I'm working on the SE Linux policy patch for Fedora 17 which is 153,202 lines and 1335 files.

Then after finding the files that are of interest I had to separate out the ~100 that were of note from the rest which isn't a trivial matter when you have a 153,202 line text file!

Thanks Rodney for the reference to csplit. That is much more convenient than downloading a random Ruby file from the net.
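For the "grep and rm" step on pieces that csplit (or any other splitter) has already produced, something like the following Python might do. It is only a sketch: the function names are made up for illustration, it assumes each piece is a plain unified diff for one file, and it only looks at added and removed lines, not at context lines.

```python
import re


def changed_lines(diff_text):
    """Yield added and removed lines of a unified diff, minus the +/- marker."""
    for line in diff_text.splitlines():
        # Skip the "--- old" / "+++ new" file headers. Assumption: no real
        # changed line happens to look like one of these headers.
        if line.startswith(("--- ", "+++ ")):
            continue
        if line.startswith(("+", "-")):
            yield line[1:]


def grep_pieces(pieces, pattern):
    """Return the (name, diff_text) pairs whose changed lines match pattern."""
    rx = re.compile(pattern)
    return [
        (name, text)
        for name, text in pieces
        if any(rx.search(line) for line in changed_lines(text))
    ]
```

To use it on the csplit output, read each xx* file into a (filename, contents) pair, pass the list to grep_pieces() with the regex, and write the returned texts out one after another to get a single smaller patch.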

On Sunday 10 June 2012 16:16:39 Russell Coker wrote:
> How would you do that exactly?
    git log -p /path/to/file

That will show you all the changes to that particular file.

-- 
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP

Chris Samuel <chris@csamuel.org> wrote:
> On Sunday 10 June 2012 16:16:39 Russell Coker wrote:
>> How would you do that exactly?
> git log -p /path/to/file
> That will show you all the changes to that particular file.
This is what I would use, but it's also worth noting that git diff can show you the differences between two commits that apply to a specified file (if I'm reading the manual page correctly).

Re your original question: have you checked patchutils? I haven't used it much, but grepdiff and splitdiff sound promising...

Russell Coker wrote:
> The problem is not just getting a list of changed files, it's extracting the patches for those files. In this case I'm working on the SE Linux policy patch for Fedora 17 which is 153,202 lines and 1335 files.
BTW, in recent years Debian addressed this by using quilt (from SuSE) and dpkg source format 3.0 [0] such that instead of

    foo_1.0.orig.tar.gz
    foo_1.0-1.diff.gz

you have

    foo_1.0.orig.tar.gz
    foo_1.0-1.debian.tar.gz

where the latter is a tarball of the debian/ subdir, and patches to the upstream tree are in logical chunks like

    debian/patches/017-hyphen-used-as-minus-sign

Each such patch is also strongly encouraged to have DEP-3 [1] metadata.

Maybe Fedora also has something like that, and SE Linux policy is one of those packages that hasn't started using it yet. If not, maybe you should suggest such an approach to them! :-)

PS: quilt and debian/source/format 3.0 are orthogonal; you can use one without the other.

[0] documented in the dpkg-source(1) manpage.
[1] http://dep.debian.net/deps/dep3/

On Sun, 2012-06-10 at 13:45 +1000, Russell Coker wrote:
> I have to deal with some very long patches that have many sections that are of interest. What I need is a tool that allows basic regular expressions to search and which then gets every file that matches. IE if a patch changed files a.c, b.c, and c.c and I did a search for printf then I would want patches for the files which have changes related to printf and have the changes include all sections for each file in question.
> Failing that does anyone know of a tool to split a large patch into one separate patch for each file that is patched? EG a patch would be split into a.c.diff, b.c.diff, and c.c.diff? Once a patch was split like that I could use some shell code that involves grep and rm to get the patches I really want.
    csplit the-patch-file '/^--- /' '{*}'

should split up the patches for you, leaving an empty xx00 and the patches in xx* thereafter. You may want to further tighten the pattern though.
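One way the pattern could be tightened, sketched in Python rather than as a csplit regex: only treat a "--- " line as a file header when it sits outside a hunk and is followed by a "+++ " line, so that a removed line that itself starts with "-- " does not cause a split. This is an illustration, not a tested replacement for csplit; split_patch and its helpers are hypothetical names, and it assumes a plain unified diff whose hunk headers carry correct line counts.

```python
import re

# Matches a unified-diff hunk header such as "@@ -1,4 +1,5 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def _count(group):
    # An omitted count in a hunk header means 1.
    return 1 if group is None else int(group)


def _header_name(line):
    # "+++ b/a.c\t2012-06-10 ..." -> "b/a.c"
    return line[4:].split("\t")[0].strip()


def split_patch(text):
    """Return a list of (filename, diff_text) pairs, one per patched file."""
    lines = text.splitlines(keepends=True)
    files = []    # [name, [lines]] for each file seen so far
    pending = []  # non-hunk lines (e.g. "diff -ur ...") awaiting the next header
    old_left = new_left = 0  # lines still expected in the current hunk
    i = 0
    while i < len(lines):
        line = lines[i]
        if old_left > 0 or new_left > 0:
            # Inside a hunk: count lines so that a removed "-- x" line
            # (shown as "--- x") is not mistaken for a file header.
            if line.startswith("-"):
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif not line.startswith("\\"):
                old_left -= 1
                new_left -= 1
            files[-1][1].append(line)
        elif (line.startswith("--- ") and i + 1 < len(lines)
              and lines[i + 1].startswith("+++ ")):
            name = _header_name(lines[i + 1])
            if name == "/dev/null":  # deleted file: fall back to the old name
                name = _header_name(line)
            files.append([name, pending + [line, lines[i + 1]]])
            pending = []
            i += 1  # the "+++ " line has been consumed as well
        elif files and not pending and (HUNK_RE.match(line) or line.startswith("\\")):
            m = HUNK_RE.match(line)
            if m:
                old_left, new_left = _count(m.group(1)), _count(m.group(2))
            files[-1][1].append(line)
        else:
            # Text between files; it is attached to the next file's piece.
            # Trailing text after the last hunk is discarded.
            pending.append(line)
        i += 1
    return [(name, "".join(body)) for name, body in files]
```

Each returned pair could then be written out as a.c.diff, b.c.diff and so on. A list is returned rather than a dict because the same file name can appear more than once in a long patch.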

On 10 June 2012 13:45, Russell Coker <russell@coker.com.au> wrote:
> I have to deal with some very long patches that have many sections that are of interest. What I need is a tool that allows basic regular expressions to search and which then gets every file that matches. IE if a patch changed files a.c, b.c, and c.c and I did a search for printf then I would want patches for the files which have changes related to printf and have the changes include all sections for each file in question.
Not sure if any of these will help your immediate problem (filterdiff?), however they are worth knowing about regardless.

Package: patchutils
Status: install ok installed
Multi-Arch: foreign
Priority: optional
Section: text
Installed-Size: 223
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: amd64
Version: 0.3.2-1.1
Depends: libc6 (>= 2.4), perl, patch, debianutils (>= 1.16)
Description: Utilities to work with patches
 This package includes the following utilities:
  - combinediff creates a cumulative patch from two incremental patches
  - dehtmldiff extracts a diff from an HTML page
  - filterdiff extracts or excludes diffs from a diff file
  - fixcvsdiff fixes diff files created by CVS that "patch" mis-interprets
  - flipdiff exchanges the order of two patches
  - grepdiff shows which files are modified by a patch matching a regex
  - interdiff shows differences between two unified diff files
  - lsdiff shows which files are modified by a patch
  - recountdiff recomputes counts and offsets in unified context diffs
  - rediff and editdiff fix offsets and counts of a hand-edited diff
  - splitdiff separates out incremental patches
  - unwrapdiff demangles patches that have been word-wrapped
Original-Maintainer: Christoph Berg <myon@debian.org>
Homepage: http://cyberelk.net/tim/patchutils/index.html

-- 
Brian May <brian@microcomaustralia.com.au>

participants (7)
- Brian May
- Chris Samuel
- Jason White
- Matthew Cengia
- Rodney Brown
- Russell Coker
- Trent W. Buck