Substring replacement in string

Substring replacement in string

Alrik Thiem-2
Dear R-help list,

I would like to replace all lower-case letters in a string that are not part
of certain fixed expressions. For example, I have the string:

"pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"

Here I would like to replace every lower-case name that is not one of
the functions "pmin" and "pmax" by 1 - toupper(name), to get

"pmin(pmax(pmin(1 - X1, X2), pmin(X3, X4)) == Y, pmax(Z1, 1 - Z1))"

Any ideas on how I could achieve that?

Many thanks and best wishes,

Alrik


********************************
Alrik Thiem
Post-Doctoral Researcher

Department of Philosophy
University of Geneva
Rue de Candolle 2
CH-1211 Geneva

+41 76 527 80 83

http://www.alrik-thiem.net
http://www.compasss.org

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Substring replacement in string

William Dunlap
If your string will always represent an R expression, you could work with
the expression directly with functions like all.names() and substitute().

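f <- function (expr)
{
    # All symbols used in the expression, minus the functions to leave alone.
    toReplace <- setdiff(all.names(expr), c("pmin", "pmax"))
    # Keep only names containing a lower-case letter (this also drops "==").
    toReplace <- grep(value = TRUE, "[a-z]", toReplace)
    names(toReplace) <- toReplace
    # Map each name, e.g. x1, to the call 1 - X1.
    replacementList <- lapply(toReplace, function(name) call("-",
        1, as.name(toupper(name))))
    # Substitute the replacement calls into the original expression.
    do.call(substitute, list(expr, replacementList))
}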

> In <- quote(pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1)))
> Desired <- quote(pmin(pmax(pmin(1 - X1, X2), pmin(X3, X4)) == Y, pmax(Z1, 1 - Z1)))
> all.equal(Desired, f(In))
[1] TRUE
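
To see what f() does at each step, the intermediate values can be looked at on their own. This is only a sketch of the same idea: the names all_syms, to_replace and repl are made up for illustration, and "[[:lower:]]" is used in place of "[a-z]" on the assumption that character ranges may behave differently across locales.

```r
expr <- quote(pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1)))

# Every symbol in the expression, function names and operators included.
all_syms <- all.names(expr)

# Drop the protected functions, then keep names with a lower-case letter.
to_replace <- setdiff(all_syms, c("pmin", "pmax"))
to_replace <- grep("[[:lower:]]", to_replace, value = TRUE)

# A named list mapping each name (e.g. x1) to the call 1 - X1.
repl <- lapply(setNames(to_replace, to_replace),
               function(nm) call("-", 1, as.name(toupper(nm))))

# substitute() swaps each listed symbol for its replacement call.
result <- do.call(substitute, list(expr, repl))

print(to_replace)
print(result)
```

Because the work is done on the parsed expression rather than on characters, function names such as pmin can never be split or partly matched.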





Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: Substring replacement in string

Alrik Thiem-2
Many thanks. Unfortunately, I cannot work directly on these expressions since they’re only created from other strings. Would I first have to transform these strings to unevaluated expressions?
 

Re: Substring replacement in string

Gabor Grothendieck
In reply to this post by Alrik Thiem-2
On Fri, Feb 27, 2015 at 5:19 PM, Alrik Thiem <[hidden email]> wrote:

> I would like to replace all lower-case letters in a string that are not part
> of certain fixed expressions. For example, I have the string:
>
> "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
>
> Where I would like to replace all lower-case letters that do not belong to
> the functions "pmin" and "pmax" by 1 - toupper(...) to get
>
> "pmin(pmax(pmin(1 - X1, X2), pmin(X3, X4)) == Y, pmax(Z1, 1 - Z1))"
>

Assuming x is the input string:

gsub("(\\b[a-oq-z][a-z0-9]+)", "1-\\U\\1", x, perl = TRUE)
## [1] "pmin(pmax(pmin(1-X1, X2), pmin(X3, X4)) == Y, pmax(Z1, 1-Z1))"
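
For readers less used to regular expressions, the pattern can be read piece by piece. The sketch below repeats the call above with comments; the second input, with a variable named p1, is a made-up example showing one consequence of excluding "p" as a first letter.

```r
x <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"

# \\b         word boundary, so matching starts at the beginning of a name
# [a-oq-z]    one lower-case letter other than "p" (this skips pmin/pmax)
# [a-z0-9]+   one or more further lower-case letters or digits
# \\U\\1      the matched name, upper-cased (needs perl = TRUE)
pattern <- "(\\b[a-oq-z][a-z0-9]+)"
out <- gsub(pattern, "1-\\U\\1", x, perl = TRUE)
print(out)

# Hypothetical input: a variable that starts with "p" is left untouched.
y <- "pmin(p1, q1)"
out_p <- gsub(pattern, "1-\\U\\1", y, perl = TRUE)
print(out_p)
```

So the one-liner relies on no variable name beginning with "p", which holds for the strings in this thread.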



--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: Substring replacement in string

Alrik Thiem-2
Dear Gabor,

Many thanks. Works like a charm, but I can't get it to work with

"pmin(pmax(pmin(a,B),pmin(a,C,d))==Y,pmax(E,e))"

i.e., with strings where the arguments of the pmin/pmax functions are single letters with no digits after them. Could this be generalized to handle both cases?

Best wishes,
Alrik


Re: Substring replacement in string

Michael Dewey
Dear Alrik

This may seem a silly suggestion, but why not just define new functions
PMIN and PMAX that call pmin and pmax? Obviously that does not solve your
problem if it is more general than your example.


--
Michael
http://www.dewey.myzen.co.uk


Re: Substring replacement in string

Alrik Thiem-2
Dear Michael

I'm not sure what you mean. Perhaps a more general description of my problem will help to clarify.

What I have to deal with are truth table output functions that always take, for example, the following form:

Delta <- "(a*B+a*C*d<=>Y)*(E+e)"

I.e. these functions will always have the structure (.*.+.*.+...<=>.)*(.+.), where the dots in the antecedent could be further conjunctions of unspecified complexity. I now need to filter all rows from the truth table that are compatible with this output function. To create the input part of the truth table "tt" for Delta above, I do:

library(QCA) # createMatrix() function is part of this package
Delta.upper <- toupper(Delta)
# Split on the operator characters to get the distinct factor names.
f.names <- unique(unlist(strsplit(Delta.upper, "[(*+<=>)]")))
f.names <- f.names[f.names != ""]
# One row per combination of 0/1 values of the factors.
tt <- data.frame(createMatrix(rep(2, length(f.names))))
dimnames(tt) <- list(as.character(seq(2^length(f.names))), f.names)
tt

> tt
   A B C D Y E
1  0 0 0 0 0 0
2  0 0 0 0 0 1
.  . . . . . .
63 1 1 1 1 1 0
64 1 1 1 1 1 1

I now need to transform Delta into a string of the following form in order to extract the subset of compatible rows from "tt":

"pmin(pmax(pmin(1-tt$A,tt$B),pmin(1-tt$A,tt$C,1-tt$D))==tt$Y,pmax(tt$E,1-tt$E))==TRUE"

With this string, I can then do:

> tt[pmin(pmax(pmin(1-tt$A,tt$B), pmin(1-tt$A,tt$C,1-tt$D))==tt$Y,pmax(tt$E,1-tt$E))==TRUE, ]
   A B C D Y E
1  0 0 0 0 0 0
2  0 0 0 0 0 1
.  . . . . . .
61 1 1 1 1 0 0
62 1 1 1 1 0 1
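
The filtering step can also be tried without the QCA package and without pasting "tt$" in front of every name. The sketch below is an assumption-laden stand-in, not the code used above: expand.grid() replaces createMatrix() (with the columns reversed so that the last factor varies fastest, as in the printed table), and the rule string is evaluated with the data frame as its environment.

```r
f.names <- c("A", "B", "C", "D", "Y", "E")
n <- length(f.names)

# All 0/1 combinations; reverse the columns so that E varies fastest.
grid <- expand.grid(rep(list(0:1), n))
tt <- grid[, n:1]
names(tt) <- f.names

# The target string, written without the tt$ prefixes.
rule <- "pmin(pmax(pmin(1-A,B),pmin(1-A,C,1-D))==Y,pmax(E,1-E))==TRUE"

# Evaluate the parsed rule inside tt, so A, B, ... refer to its columns.
keep <- eval(parse(text = rule)[[1]], tt)
compatible <- tt[keep, ]

print(nrow(compatible))
print(head(compatible, 2))
print(tail(compatible, 2))
```

Evaluating inside the data frame keeps the string transformation down to the negation step alone, since no prefix has to be added to the variable names.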


Re: Substring replacement in string

William Dunlap
In reply to this post by Alrik Thiem-2
  string <- "pmin(1, x)"
  expr <- parse(text=string)[[1]]

will convert the string to an unevaluated language object.
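
Putting the two pieces together gives a string-to-string round trip: parse the text, rewrite the expression, and deparse the result. The following is a sketch under stated assumptions; negate_lower is a made-up name for a function along the lines of f() from earlier in the thread, and a large width.cutoff is passed to deparse() so that the result is not split over several strings.

```r
# Hypothetical helper in the spirit of f(): rewrite lower-case names as 1 - NAME.
negate_lower <- function(expr, keep = c("pmin", "pmax")) {
  nms <- setdiff(all.names(expr), keep)
  nms <- grep("[[:lower:]]", nms, value = TRUE)
  names(nms) <- nms
  repl <- lapply(nms, function(nm) call("-", 1, as.name(toupper(nm))))
  do.call(substitute, list(expr, repl))
}

s <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"

# String -> unevaluated call -> rewritten call -> string.
expr <- parse(text = s)[[1]]
out <- paste(deparse(negate_lower(expr), width.cutoff = 500L), collapse = "")
print(out)
```

Note that deparse() applies its own spacing, so the output is formatted as R would print the expression rather than as the input string was typed.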


Re: Substring replacement in string

Michael Dewey
In reply to this post by Alrik Thiem-2
Your original problem statement seemed to me to be about transforming all
the lower-case identifiers to upper case, except for pmin and pmax. My
suggestion was not to bother with the exception: transform everything to
upper case and then define PMIN and PMAX.
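
A minimal sketch of that idea follows. The values in vals are made up purely to show that the upper-cased string still evaluates. Upper-casing only changes the case of the names, so the "1 -" negation asked for in the original question would still have to be added in a separate step.

```r
# Upper-case aliases, so an upper-cased string still finds the functions.
PMIN <- pmin
PMAX <- pmax

x <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
upper <- toupper(x)
print(upper)

# Made-up values, only to show that the upper-cased string evaluates.
vals <- list(X1 = 1, X2 = 0, X3 = 1, X4 = 1, Y = 1, Z1 = 0)
value <- eval(parse(text = upper)[[1]], vals)
print(value)
```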


Re: Substring replacement in string

Alrik Thiem-2
Ah, I see what you mean. Thanks for the suggestion. I'll try it.


Re: Substring replacement in string

Gabor Grothendieck
In reply to this post by Alrik Thiem-2
Replace the + (i.e. 1 or more) in the pattern with a * (i.e. 0 or more):

   x <- "pmin(pmax(pmin(a,B),pmin(a,C,d))==Y,pmax(E,e))"

   gsub("(\\b[a-oq-z][a-z0-9]*)", "1-\\U\\1", x, perl = TRUE)

giving:

   [1] "pmin(pmax(pmin(1-A,B),pmin(1-A,C,1-D))==Y,pmax(E,1-E))"

Here is a visualization of the regular expression:

   https://www.debuggex.com/i/5ByOCQS2zIdPEf-f.png
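
The effect of the change can be checked on both example strings from this thread. The sketch below also tries a variant that is not part of the answer above and should be read as an assumption: instead of excluding "p" as a first letter, it uses a negative lookahead to reject exactly the words pmin and pmax, so a made-up variable such as p1 is handled too. The helper name negate is hypothetical.

```r
x1 <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
x2 <- "pmin(pmax(pmin(a,B),pmin(a,C,d))==Y,pmax(E,e))"

# Hypothetical wrapper around the gsub() call used in this thread.
negate <- function(s, pattern) gsub(pattern, "1-\\U\\1", s, perl = TRUE)

# Pattern from the answer: "*" also allows single-letter names.
star <- "(\\b[a-oq-z][a-z0-9]*)"

# Assumed variant: reject only the whole words pmin and pmax.
lookahead <- "\\b(?!(?:pmin|pmax)\\b)([a-z][a-z0-9]*)"

res1 <- negate(x1, star)
res2 <- negate(x2, star)
res3 <- negate("pmin(pmax(p1, Q), q)", lookahead)

print(res1)
print(res2)
print(res3)
```

The lookahead version is longer, but it makes the list of protected words explicit, which may be easier to extend if other function names appear in the strings.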



Re: Substring replacement in string

Alrik Thiem-2
Dear Gabor,

That works perfectly!

Many thanks and best wishes,
Alrik


Re: Substring replacement in string

Pages, Herve
In reply to this post by Alrik Thiem-2
Hi Alrik,

With the Biostrings/IRanges infrastructure (Bioconductor packages), you
can do this with:

   library(Biostrings)
   x0 <- BString("pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))")
   donttouch_words <- c("pmin", "pmax")

   ## Extract the substrings to modify (target substrings).
   donttouch_regions <- reduce(do.call("c", lapply(donttouch_words, matchPattern, x0)))
   target_regions <- ranges(gaps(donttouch_regions))
   target_substrings <- extractAt(x0, target_regions)

   ## Modify them.
   old <- paste0(letters, collapse="")
   new <- paste0(LETTERS, collapse="")
   target_substrings <- chartr(old, new, target_substrings)

   ## Replace in original string.
   x1 <- replaceAt(x0, target_regions, target_substrings)

Then:

   > x1
     57-letter "BString" instance
   seq: pmin(pmax(pmin(X1, X2), pmin(X3, X4)) == Y, pmax(Z1, Z1))

   > as.character(x1)
   [1] "pmin(pmax(pmin(X1, X2), pmin(X3, X4)) == Y, pmax(Z1, Z1))"
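
Note that the result above upper-cases the names but does not add the "1 - " prefix asked for in the original question. The same "protect some words, rewrite the rest" idea can be sketched in base R with gregexpr() and regmatches(). This is a swapped-in technique, not the Biostrings code, and the names keep, tokens and new_tokens are made up for illustration.

```r
x <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
keep <- c("pmin", "pmax")

# Locate every name that starts with a lower-case letter.
m <- gregexpr("\\b[a-z][A-Za-z0-9]*", x, perl = TRUE)
tokens <- regmatches(x, m)[[1]]

# Leave protected words alone; rewrite the rest as 1 - NAME.
new_tokens <- ifelse(tokens %in% keep, tokens,
                     paste0("1 - ", toupper(tokens)))

# Write the new tokens back into a copy of the string.
out <- x
regmatches(out, m) <- list(new_tokens)
print(out)
```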

Hope this helps,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: Substring replacement in string

Alrik Thiem-2
Dear Hervé,

Many thanks for your suggestion. Gabor Grothendieck proposed a simple
one-liner that works perfectly for my purposes:

gsub("(\\b[a-oq-z][a-z0-9]*)", "1-\\U\\1", x, perl = TRUE)

where x is the respective string.
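
A short sketch of how the one-liner behaves, for reference. "\\b" anchors the match at the start of a word, "[a-oq-z]" accepts any lower-case first letter except "p" (which is what spares pmin and pmax), and "\\U" in the replacement upper-cases the captured name when perl = TRUE. The second input is an illustrative string that does not come from the thread; it shows the trade-off that a variable whose name starts with "p" is left untouched.

```r
pattern <- "(\\b[a-oq-z][a-z0-9]*)"

# The string from the original question.
x <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
out1 <- gsub(pattern, "1-\\U\\1", x, perl = TRUE)
print(out1)

# Illustrative input: "p" is skipped because names starting
# with "p" are excluded by the character class.
out2 <- gsub(pattern, "1-\\U\\1", "pmin(p, q1)", perl = TRUE)
print(out2)
```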

Best wishes,
Alrik


Re: Substring replacement in string

Pages, Herve
Hi Alrik,

On 02/28/2015 11:06 PM, Alrik Thiem wrote:
> Dear Hervé,
>
> Many thanks for your suggestion. Gabor Grothendieck proposed a simple
> one-liner that works perfectly for my purposes:
>
> gsub("(\\b[a-oq-z][a-z0-9]*)", "1-\\U\\1", x, perl = TRUE)
>
> where x is the respective string.

Sounds good. I didn't realize that you also wanted to prefix the
lower-case names with "1 - ", so my solution was not doing the right
thing anyway. Here is the corrected solution, just in case:

   library(Biostrings)

   funnyReplace <- function(x, protected_words)
   {
     x <- BString(x)

     ## Extract the substrings to modify (target substrings).
     protected_regions <- reduce(do.call("c",
         lapply(protected_words, matchPattern, x)))
     target_regions <- ranges(gaps(protected_regions))
     target_substrings <- extractAt(x, target_regions)

     ## Modify them (using a regexp almost like Gabor's, except
     ## that "p" is not treated as an exception).
     target_substrings <- gsub("(\\b[a-z][a-z0-9]*)", "1 - \\U\\1",
                               target_substrings, perl=TRUE)

     ## Replace in original string.
     x <- replaceAt(x, target_regions, target_substrings)
     as.character(x)
   }

Then:

   > x <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
   > funnyReplace(x, c("pmin", "pmax"))
   [1] "pmin(pmax(pmin(1 - X1, X2), pmin(X3, X4)) == Y, pmax(Z1, 1 - Z1))"

It works even if a variable name starts with a "p":

   > funnyReplace("pmin(p)", c("pmin", "pmax"))
   [1] "pmin(1 - P)"

and you can specify an arbitrary number of protected words.
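
A possible base-R variant of the same idea, for anyone who prefers not to load Biostrings: a negative lookahead in a Perl-style regexp can exclude the protected words directly. This is a minimal sketch under stated assumptions, not the method used above. The function name replace_unprotected is hypothetical, and the protected words are assumed to be plain identifiers without regexp metacharacters.

```r
replace_unprotected <- function(x, protected_words) {
  # Assumption: protected_words contain no regexp metacharacters.
  protected <- paste(protected_words, collapse = "|")

  # At a word start, refuse to match if the whole word is a protected
  # word; otherwise capture a lower-case name such as "x1" or "p".
  pattern <- paste0("\\b(?!(?:", protected, ")\\b)([a-z][a-z0-9]*)")

  # \\U upper-cases the captured name (requires perl = TRUE).
  gsub(pattern, "1 - \\U\\1", x, perl = TRUE)
}

x <- "pmin(pmax(pmin(x1, X2), pmin(X3, X4)) == Y, pmax(Z1, z1))"
print(replace_unprotected(x, c("pmin", "pmax")))
print(replace_unprotected("pmin(p)", c("pmin", "pmax")))
```

One difference in design: this sketch protects whole words only, so a name such as "pminx" would be treated as a variable, whereas matching the protected words as fixed substrings would protect the "pmin" inside it. Names containing "." or "_" are not handled by either regexp.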

Cheers,
H.
