alternate licensing for package data?

bbolker
  Does anyone have speculations about the implications of the GPL for
data included in a package, or more generally for restricting use of data?

  The specific use case is that I have a package which is otherwise
GPL (version unspecified at present).  There are various data sets
included, but they are all essentially in the public domain.  I'm
thinking about including another data set, but the original author of
that data might like to impose some reasonable restrictions (e.g.
please don't use in an academic publication without explicit
permission ...)  Would such rules be expected to be compatible with
CRAN rules?  Will having the package be "GPL except for file XXX, see
LICENSE" mess things up horribly?

  I can of course make the data available for download and include a
link, and/or make a special package that contains only these data, but
it would seem to be more convenient for end users, and more
future-proof, to put everything in one place.

  I know I will eventually need to take this up with CRAN, but I'm
looking for reasonably informed opinions/suggestions ...

  cheers
    Ben Bolker



______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: alternate licensing for package data?

Martyn Plummer-3
I think this is covered well by the CRAN repository policy:
http://cran.r-project.org/web/packages/policies.html 

The two key license requirements are that:
1) CRAN must have a perpetual license to distribute the package
2) The package license should be listed here:
https://svn.r-project.org/R/trunk/share/licenses/license.db
Packages with licenses not included in that list are generally not
accepted.

However, there are exceptions, and you can find some by searching for
"non-commercial site:cran.r-project.org" on Google. See also section
1.1.2 of the Writing R Extensions manual for an explanation.

Personally, I would not want to add the extra complexity to a package
that is otherwise GPL.

Martyn

On Tue, 2015-04-21 at 19:23 -0400, Ben Bolker wrote:

>   Does anyone have speculations about the implications of the GPL for
> data included in a package, or more generally for restricting use of data?


Re: alternate licensing for package data?

Roger Bivand
Martyn Plummer <plummerm <at> iarc.fr> writes:

>
> I think this is covered well by the CRAN repository policy:
> http://cran.r-project.org/web/packages/policies.html 
>
> The two key license requirements are that:
> 1) CRAN must have a perpetual license to distribute the package
> 2) The package license should be listed here:
> https://svn.r-project.org/R/trunk/share/licenses/license.db
> Packages with licenses not included in that list are generally not
> accepted.
>
...

>
> Personally, I would not want to add the extra complexity to a package
> that is otherwise GPL.
>
> Martyn
>
> On Tue, 2015-04-21 at 19:23 -0400, Ben Bolker wrote:
> >   Does anyone have speculations about the implications of the GPL for
> > data included in a package, or more generally for restricting use of data?
> >
...
While I agree with Martyn with respect to code, documentation, and
vignettes, the point Ben raises is relevant and not obvious. Data sets in,
say, GPL-licensed packages are on occasion challenged by Debian packagers
where it isn't obvious that GPL is appropriate. Some spatial packages are
not accepted for packaging as is because of included data, data that is
needed to run realistic examples.

The problem could be picky packagers, but it is also reasonable that
well-known example data sets could be licensed differently.
share/licenses/license.db lists for example CC BY-SA 4.0 as both FOSS and
extensible but free_and_GPLv3_incompatible. One possibility I examined when
challenged was to place all such data files in a separate package, for
example under a CC license accepted by CRAN. I didn't complete the task,
but I understand Ben's question as raising the same issue.
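
For anyone who wants to look up such license.db entries programmatically, here
is a minimal sketch in Python. It assumes the file is a set of
blank-line-separated "Key: value" records (DCF style). The field names (SSS,
FOSS, Extensible, FSF) and the sample records are illustrative assumptions
modelled on the CC BY-SA 4.0 description above, not a verbatim copy of the
real file, and the helper names parse_dcf and find_license are hypothetical.

```python
# Illustrative DCF-style records. Field names and values are assumptions
# based on the description in this thread, not copied from license.db.
SAMPLE_DB = """\
Name: GNU General Public License
Abbrev: GPL
Version: 3
SSS: GPL-3
FOSS: yes

Name: Creative Commons Attribution-ShareAlike 4.0 International License
Abbrev: CC BY-SA
Version: 4.0
SSS: CC BY-SA 4.0
FOSS: yes
Extensible: yes
FSF: free_and_GPLv3_incompatible
"""


def parse_dcf(text):
    """Split blank-line-separated 'Key: value' records into dicts.

    Continuation lines (leading whitespace) are not handled in this sketch.
    """
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            # Split on the first colon only, so values may contain colons.
            key, _, value = line.partition(":")
            record[key.strip()] = value.strip()
        records.append(record)
    return records


def find_license(records, short_spec):
    """Return the record whose SSS field equals short_spec, or None."""
    for record in records:
        if record.get("SSS") == short_spec:
            return record
    return None


records = parse_dcf(SAMPLE_DB)
cc = find_license(records, "CC BY-SA 4.0")
print(cc["FOSS"], cc["Extensible"], cc["FSF"])
```

A license that is missing from the file would come back as None, which is
roughly the "generally not accepted" case Martyn describes.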

Roger

Roger Bivand
Department of Economics
NHH Norwegian School of Economics
Helleveien 30
N-5045 Bergen, Norway

Re: alternate licensing for package data?

braverock
On Wed, 2015-04-22 at 11:34 +0000, Roger Bivand wrote:

> The problem could be picky packagers, but it is also reasonable that
> well-known example data sets could be licensed differently.
> share/licenses/license.db lists for example CC BY-SA 4.0 as both FOSS and
> extensible but free_and_GPLv3_incompatible. One possibility I examined when
> challenged was to place all such data files in a separate package, for
> example under a CC license accepted by CRAN - I didn't complete the task,
> but understand Ben's question as applying to the same question.

It is also clearly possible to license data files differently than the
package.  GPL is copyleft for compiled code.  R data files are not
compiled/linked into the package, they are included in a tarball or zip
file.  As such, the copyleft provision of GPL doesn't necessarily apply
to non-compiled files in the package collection, and isn't necessarily
intended to apply (the GNU licenses page suggests not using GPL for
data).

Whether CRAN or Debian packagers would accept an open but mixed code/data
license scheme is not for me to say, but I don't see any impediments
from the licenses themselves.


--
Brian G. Peterson
http://braverock.com/brian/
Ph: 773-459-4973
IM: bgpbraverock


Re: alternate licensing for package data?

Dirk Eddelbuettel
In reply to this post by Roger Bivand

On 22 April 2015 at 11:34, Roger Bivand wrote:
| While I agree with Martyn with respect to code, documentation, and
| vignettes, the point Ben raises is relevant and not obvious. Data sets in
| say GLP-licensed packages are on occasion challenged by Debian packagers

Not generally the packagers (who get frustrated by this like everybody else)
but by the "ftp-masters" teams who look over what gets into the Archive.

They are the license reviewers, and gate-keepers.

In several cases we (ie "packagers") had to write README.sources to document
origins of datasets.  That is generally a little silly as ... R itself
already enforces in the .Rd files. So for the packages where I had to do that
the README.sources effectively becomes a forward reference to the R docs.
But then again the ftp-masters review _thousands_ of packages and having to
help their workflow is a small burden.

In general, nitpicky licensing issues have been discussed (to mind-numbing
length) on the debian-legal list. Those interested in the issue may want to
peruse or search the archive:
    http://news.gmane.org/gmane.linux.debian.devel.legal

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: alternate licensing for package data?

bbolker
Dirk Eddelbuettel <edd <at> debian.org> writes:

> On 22 April 2015 at 11:34, Roger Bivand wrote:
> | While I agree with Martyn with respect to code, documentation, and
> | vignettes, the point Ben raises is relevant and not obvious. Data sets in
> | say GLP-licensed packages are on occasion challenged by Debian packagers

        [GPL]
 

Thanks for the information, everyone!  I think I'm just going to
handle it the sloppy way, providing a .Rd file containing
documentation and a URL for the data set.  This is not particularly
good for long-term maintenance, but it seems silly to try to get a
separate package onto CRAN for a *single* (small) data set.

  For what it's worth, I've been informed by the CRAN maintainers
that

> 'license' is singular in the CRAN policies, something people
> sometimes overlook.
> A package must have a single licence that applies to all of the
> package (even if alternative licences are offered for all or part),
> so "GPL except for file XXX" is not viable.


Re: alternate licensing for package data?

Avraham Adler
To get around this last problem, perhaps you can take advantage of
CRAN's suggestion regarding large data files [1] where they say
"[w]here a large amount of data is required (even after compression),
consideration should be given to a separate data-only package which
can be updated only rarely (since older versions of packages are
archived in perpetuity)." This would allow you to have a different
license for the data package than for the main package. Whether CRAN
will accept a file + LICENSE with "please only use academically" or a
CC-BY-NC license is a different question.

Avi

[1] <http://cran.r-project.org/web/packages/policies.html#Source-packages>

On Wed, Apr 22, 2015 at 4:49 PM, Ben Bolker <[hidden email]> wrote:

> Thanks for the information, everyone!  I think I'm just going to
> handle it the sloppy way, providing a .Rd file containing
> documentation and a URL for the data set.  This is not particularly
> good for long-term maintenance, but it seems silly to try to get a
> separate package onto CRAN for a *single* (small) data set.
>
>   For what it's worth, I've been informed by the CRAN maintainers
> that
>
>> 'license' is singular in the CRAN policies, something people
>> sometimes overlook.
>> A package must have a single licence that applies to all of the
>> package (even if alternative licences are offered for all or part),
>> so "GPL except for file XXX" is not viable.