[GTALUG] Calling all networking and SVN gurus

Giles Orr gilesorr at gmail.com
Tue Jun 11 22:42:07 EDT 2019


On Fri, 7 Jun 2019 at 13:16, Giles Orr <gilesorr at gmail.com> wrote:

> To forestall the inevitable suggestion: no, the solution is not to move to
> git.  At least not yet: for various reasons, it can't happen right now.
> This is the last holdout; all our other repos are already git.
>
> I apologize for the length of this dissertation: I've been doing my
> homework as fast as I can, and I want to provide complete information.
>
> I've just moved a previously semi-local (different building but "on
> campus") SVN server to a cloud instance (running Debian, SVN 1.9.5, and
> Apache 2.4.25, access via https://).  For the most part it went smoothly,
> but now our "on campus" Jenkins server intermittently loses the network
> connection on 'svn up' so the deploy fails.  That's bad.  What's far worse
> is that while Jenkins initially resumed these broken updates politely when
> the deployment was re-run, it's now decided that these resumes have left a
> locked workspace and it has to do a fresh checkout.  One of the problems
> with moving this repository to git is that SVN trunk is ~5G in size: a
> checkout to a local client takes 5-10 minutes, but for some reason a
> Jenkins checkout takes about 30 minutes (this is all pretty new to me, and
> I haven't had time to investigate that time difference yet).  An update
> takes 60 seconds, but a new checkout takes 30 minutes - you can see where
> that causes delays in deployment.
>
> One possible mitigation is to trap the SVN failure, do a clean-up on the
> directory and re-run.  I may have to try this, but ... that's just
> mitigation, not a solution.
>
> The Jenkins server is on Windows (I wasn't given a choice) and mostly
> works well.  It uses Cygwin for all the SVN stuff (SVN version 1.11.x).
> It's also at a different physical location from me with different network
> rules.
>
> The critical lines of the failure error:
>
>     org.tmatesoft.svn.core.SVNException: svn: E175002: Connection reset
>
>     svn: E175002: REPORT request failed on '/svn/repo/!svn/vcc/default'
>
>
> (It being Java, the errors run to 40 or 50 lines: I think this is the only
> part that's important.)  Unfortunately, this is one of those errors for
> which Google searches produce lots of questions, lots of speculation ... and no
> solid answers.  At least not that I've found.  Likewise, a lot of people
> want to know, as I did, about the relatively unusual filepath
> ("!svn/vcc/default") but I've never seen a solid answer as to what that's
> about either.  The logs show Jenkins requests against that filepath with
> both REPORT and PROPFIND, but Jenkins is only failing on REPORT.  Both of
> these request types are WebDAV extensions.
>
> Our staff don't seem to be having any trouble checking out or updating the
> repository across a mix of Windows and Mac clients.
>
> I've so far failed at getting more logging out of SVN and Apache: what I
> do have doesn't tell me much that's useful, at least not about these failures.
>
> This problem is intermittent and infrequent.  I'm thinking the next step
> is network sniffing - although I'm hoping someone can suggest something
> better.  I'm relatively inexperienced with Wireshark and tcpdump (and SVN
> ...), but what experience I do have suggests all I'm going to learn is
> that SVN stopped providing data, without finding out why or how to fix
> it.
>
> Any suggestions welcomed, thanks.
>
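
For what it's worth, the "trap the failure, clean up, re-run" mitigation
from the quoted message could be sketched roughly as below.  This is a
minimal Python sketch under stated assumptions, not what Jenkins itself
does: it assumes a command-line svn client on the PATH (the stack trace
above comes from SVNKit, which would not go through this code), and the
function name update_with_retry, the retry count and the delay are all
made up for illustration.

```python
import subprocess
import time


def update_with_retry(workdir, attempts=3, delay=30,
                      run=subprocess.run, sleep=time.sleep):
    """Run 'svn update' in workdir; on failure, 'svn cleanup' and retry.

    Hypothetical helper: 'run' and 'sleep' are injectable only so the
    logic can be exercised without a real repository.
    Returns True if an update succeeded, False if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        result = run(["svn", "update", "--non-interactive", workdir])
        if result.returncode == 0:
            return True
        # An interrupted update can leave the working copy locked;
        # 'svn cleanup' releases stale locks so the next update can resume
        # instead of forcing a fresh checkout.
        run(["svn", "cleanup", workdir])
        if attempt < attempts:
            sleep(delay)  # give a flaky connection a moment to recover
    return False
```

A build step would call update_with_retry() with the workspace path and
fall back to a fresh checkout only if it returns False.  As noted in the
quoted message, this only mitigates the dropped connections; it does not
explain them.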

On Monday morning we had a catastrophic failure of Jenkins (caused,
inevitably, by a software upgrade performed by yours truly).  I've been
firefighting ever since.  I hope to follow up on the several excellent
suggestions given here once that particular fire is under control.  Thanks
all.

-- 
Giles
https://www.gilesorr.com/
gilesorr at gmail.com