[JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

Baptiste MATHUS
Hello everyone,

For Jenkins Essentials, one critical requirement is to be able to upgrade, and hence roll back, in an automated manner.
So, as we are committed to an open design process, I have written a first draft of the associated Jenkins Enhancement Proposal.

It is up for review at https://github.com/batmat/jep/pull/1

I am very eager for any kind of feedback there.
I am especially interested in catching & clarifying (more or less) glaring holes in that design. 

I ran some tests locally to check that the design is not obviously flawed from the start. We do not have a prototype ready yet, but hope to have something around the end of March.

Thanks everyone!

-- Baptiste

--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/CANWgJS7gojzJCpKZskJVz4e0toS_pAUvintduNRiPEog9532QA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: [JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

R. Tyler Croy
(replies inline)

On Wed, 14 Mar 2018, Baptiste Mathus wrote:

> Hello everyone,
>
> For Jenkins Essentials
> <https://github.com/jenkinsci/jep/tree/master/jep/300>, one critical
> requirement is to be able to upgrade, and hence rollback in an automated
> manner.
> So, as we are committed to an open design
> <https://github.com/jenkins-infra/evergreen#open-design> process, I have
> written a first draft of the associated Jenkins Enhancement Proposal.
>
> It is up for review at https://github.com/batmat/jep/pull/1
>
> I am very eager for any kind of feedback there.
> I am especially interested in catching & clarifying (more or less) glaring
> holes in that design.

Thanks for taking the time to send this out Ba(p)tiste! Now that I've had a
chance to take a look, I think the one thing that's missing from this document
is a bit more explanation of the problem which requires this solution.

My take on this problem space is that core and plugin upgrades can result in
modification of config.xml and other object-serialized-files on disk when an
upgrade occurs. As these files are serialized from objects in memory, when an
internal API changes within a plugin/core, it will necessarily result in
changes to files on disk. These changes may not be safe to "rollback" from,
i.e. Plugin A v0 cannot load a file generated by Plugin A v1.
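
To make the Plugin A v0/v1 case concrete, here is a minimal sketch. It is plain Python rather than Jenkins code, and every name in it (the `pluginA` element, the `timeout` and `timeoutSeconds` fields, all function names) is invented for illustration. It assumes v1 renamed a field and kept a migration path for old files, while v0 knows nothing about the new name.

```python
import xml.etree.ElementTree as ET


def save_v0(timeout_minutes: int) -> str:
    """Hypothetical Plugin A v0: stores the timeout in minutes."""
    root = ET.Element("pluginA")
    ET.SubElement(root, "timeout").text = str(timeout_minutes)
    return ET.tostring(root, encoding="unicode")


def load_v0(xml_text: str) -> int:
    """v0 only knows the <timeout> element and rejects anything else."""
    value = ET.fromstring(xml_text).findtext("timeout")
    if value is None:
        raise ValueError("unreadable config: <timeout> is missing")
    return int(value)


def save_v1(timeout_seconds: int) -> str:
    """Hypothetical Plugin A v1: the field was renamed and now holds seconds."""
    root = ET.Element("pluginA")
    ET.SubElement(root, "timeoutSeconds").text = str(timeout_seconds)
    return ET.tostring(root, encoding="unicode")


def load_v1(xml_text: str) -> int:
    """v1 reads its own format and migrates the old one (returns seconds)."""
    root = ET.fromstring(xml_text)
    seconds = root.findtext("timeoutSeconds")
    if seconds is not None:
        return int(seconds)
    return int(root.findtext("timeout")) * 60


def can_roll_back(xml_text: str) -> bool:
    """True if the older v0 code can still read the file."""
    try:
        load_v0(xml_text)
        return True
    except ValueError:
        return False
```

Upgrading works because `load_v1` accepts a file written by `save_v0`. Once `save_v1` has rewritten the file, `can_roll_back` returns `False`, which is the one-way door described above.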

This means an upgrade of Jenkins Essentials has a very real potential to cause
irreversible modifications to files on disk which prevent a safe rollback.


So that type of background/context is (IMHO) missing a bit from the JEP document.

I think the Motivation section should also explain a bit more explicitly that
"bricking" a Jenkins Essentials instance is a severe failure for the project,
and thus we need to prevent against irreversible modifications to files causing
runtime failures for the Jenkins Essentials installation.


Overall, I think this looks quite reasonable. I look forward to seeing the
implementation and tests we get to write to support it :)


Cheers
- R. Tyler Croy

------------------------------------------------------
     Code: <https://github.com/rtyler>
  Chatter: <https://twitter.com/agentdero>
     xmpp: [hidden email]

  % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------


Re: [JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

Jesse Glick-4
On Wed, Mar 14, 2018 at 1:55 PM, R. Tyler Croy <[hidden email]> wrote:
> core and plugin upgrades can result in
> modification of config.xml and other object-serialized-files on disk when an
> upgrade occurs.

Does happen, but rarely. In most cases, format changes take effect on
disk only when a `Saveable` object is in fact saved for some other
reason—a *Save* button in the UI, for example.
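
A rough sketch of that "only on save" behaviour, in plain Python. The `LazyConfig` class, the `key=value` file format, and the field names are all hypothetical stand-ins and not the Jenkins `Saveable` API. The assumption is that new code converts the old format in memory at load time and touches the file only when `save()` is called.

```python
import tempfile
from pathlib import Path


class LazyConfig:
    """Hypothetical stand-in for a saveable object; not a Jenkins class."""

    def __init__(self, path: Path) -> None:
        self.path = path
        key, _, value = path.read_text().strip().partition("=")
        # New code reads both formats; the old one is converted in memory only.
        self.timeout_seconds = int(value) * 60 if key == "timeout" else int(value)

    def save(self) -> None:
        # Only here does the new format reach the disk.
        self.path.write_text(f"timeoutSeconds={self.timeout_seconds}\n")


def demo():
    """Return the file content right after loading and again after saving."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.txt"
        path.write_text("timeout=5\n")         # written by the old version
        config = LazyConfig(path)              # upgrade: load with new code
        after_load = path.read_text().strip()  # file is still in the old format
        config.save()                          # e.g. a user clicks Save
        after_save = path.read_text().strip()  # now in the new format
        return after_load, after_save
```

In this sketch a rollback taken between load and save would still find the old file intact; the risk only materialises after the first save.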

> This means an upgrade of Jenkins Essentials has a very real potential to cause
> irreversible modifications to files on disk which prevent a safe rollback.

This is true.

> "bricking" a Jenkins Essentials instance is a severe failure for the project

This is what needs to be defined much more carefully. What would cause
an installation to be “bricked”, exactly? Years of work by core devs
(see JIRA issues with label `robustness`) have solved most cases where
Jenkins would fail to start or be used in a basic capacity merely due
to unreadable configuration files. You might get *Discard Old Data*
warnings, of course, but these are not fatal.

> we need to prevent against irreversible modifications to files causing
> runtime failures

That is a much broader requirement, at least if “runtime failures”
could be interpreted as things like “the deployment stage in all my
pipelines started failing”, and it is not clear to me that the
proposal as it stands comes close to satisfying it.

Re: [JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

Baptiste MATHUS
Hello everyone,

Sorry for the time it took to get back here. I think I finally addressed all comments.

https://github.com/batmat/jep/pull/1 is ready for another round of comments.

I hope that nothing major surfaces again. Issues will obviously be discovered later, but I feel we have thought about this enough to move forward.

Thanks a lot.


Re: [JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

Baptiste MATHUS
FYI JEP now officially filed for review at https://github.com/jenkinsci/jep/pull/67

Thank you everyone!


Re: [JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

R. Tyler Croy
(replies inline)

On Wed, 21 Mar 2018, Baptiste Mathus wrote:

> FYI JEP now officially filed for review at
> https://github.com/jenkinsci/jep/pull/67


A friendly reminder from one of the JEP Editors: please keep the discussion
about the document on this mailing list thread.

At this stage of the game the only changes/edits on the pull request will
likely be copy edits rather than structure edits. By the end of the day I will
likely give this a number and merge it as a `Draft` into the repository, so this
PR is not a great place for design discussion :)

Use the list, Luke.



- R. Tyler Croy



Re: [JENKINS-49406] Evergreen snapshotting data safety system pre-JEP: feedback welcome

R. Tyler Croy
(replies inline)

On Wed, 21 Mar 2018, Baptiste Mathus wrote:

> FYI JEP now officially filed for review at
> https://github.com/jenkinsci/jep/pull/67


Just a heads up! *puts on JEP editor hat* I have marked this as a Draft and
assigned it the number JEP-302.

It can now be found here:
    https://github.com/jenkinsci/jep/tree/master/jep/302


If you have any concerns about this proposal or questions, please chime in on
this list before mid-next week. If it looks like batmat has addressed concerns
and there is consensus on this mailing list thread, I will update the status to
'Accepted'.


Thanks batmat for your work on this design!



Cheers
- R. Tyler Croy

