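R: replacing double-escaped text
I'm using the Amazon Elastic Map Reduce command line tools to glue together a number of system calls. These commands return JSON text that has already been (partially?) escaped. Then, when the system call turns it into an R text object (intern=T), it appears to get escaped again. I need to clean it up so that it will parse with the rjson package.
I make the system call like this: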
system("~/EMR/elastic-mapreduce --describe --jobflow j-2H9P770Z4B8GG", intern=T)
which returns:
[1] "{"
[2] " "JobFlows": ["
[3] " {"
[4] " "LogUri": "s3n:\/\/emrlogs\/","
[5] " "Name": "emrFromR","
[6] " "BootstrapActions": ["
...
But the same command run from the command line returns:
{
"JobFlows": [
{
"LogUri": "s3n://emrlogs/",
"Name": "emrFromR",
"BootstrapActions": [
{
"BootstrapActionConfig": {
...
If I try to run the result of the system call through rjson, I get this error:
Error: '/' is an unrecognized escape in character string starting "s3n:/"
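I believe this is because of the double escaping in the s3n line. I'm struggling to massage this text into something that will parse.
It might be as simple as replacing "\\" with "\", but since I struggle a bit with regular expressions and escaping, I can't get it right.
So how do I take a vector of strings and replace any occurrence of "\\" with "\"? (Even to type this question I had to use three backslashes to represent two.) Any other tips for this particular use case?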
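For the literal replacement asked about here, a minimal sketch follows. It assumes the strings really do contain two backslash characters in a row; the vector `x` is hypothetical sample data, not the actual EMR output. Using `fixed = TRUE` sidesteps regular expression escaping entirely, so only R's own string-literal escaping remains.

```r
# Hypothetical sample input. In R source code "\\\\" denotes TWO backslash
# characters, so the first element holds the text  s3n:\\/\\/emrlogs\\/
x <- c("s3n:\\\\/\\\\/emrlogs\\\\/", "no backslashes here")

# fixed = TRUE treats the pattern and the replacement as literal text,
# so two backslash characters are replaced by one.
cleaned <- gsub("\\\\", "\\", x, fixed = TRUE)

# cat() writes the characters actually stored;
# print() would show each backslash doubled again.
cat(cleaned, sep = "\n")
```

Without `fixed = TRUE` the pattern is parsed as a regular expression, where each literal backslash needs its own escape, which is where the backslash counts tend to get out of hand.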
Here is my code in more detail:
> library(rjson)
> emrJson <- paste(system("~/EMR/elastic-mapreduce --describe --jobflow j-2H9P770Z4B8GG", intern=T))
>
> parser <- newJSONParser()
> for (i in 1:length(emrJson)){
+ parser$addData(emrJson[i])
+ }
>
> parser$getObject()
Error: '/' is an unrecognized escape in character string starting "s3n:/"
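One possible cleanup, sketched under assumptions: JSON defines `\/` as an escape for a plain `/`, so removing the backslash in front of each slash should not change the meaning of the data. The vector `emr_lines` below is a hypothetical four-line stand-in for `emrJson`, and the final parse step only runs if the rjson package happens to be installed.

```r
# Hypothetical short stand-in for the emrJson character vector
emr_lines <- c(
  "{",
  "  \"LogUri\": \"s3n:\\/\\/emrlogs\\/\",",
  "  \"Name\": \"emrFromR\"",
  "}"
)

# Replace the two-character sequence backslash + slash with a plain slash.
# Caveat: this naive replacement assumes the data never contains an escaped
# backslash immediately followed by a slash.
unescaped <- gsub("\\/", "/", emr_lines, fixed = TRUE)

# Join the lines into one string so the parser sees a single JSON document
json_text <- paste(unescaped, collapse = "\n")
writeLines(json_text)

# Parse in one call if rjson is available
if (requireNamespace("rjson", quietly = TRUE)) {
  parsed <- rjson::fromJSON(json_text)
  str(parsed)
}
```

Collapsing with `paste(..., collapse = "\n")` also avoids feeding the parser one line at a time in a loop.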
If you'd like to recreate the emrJson object, here is the dput() output:
> dput(emrJson)
c("{", " "JobFlows": [", " {", " "LogUri": "s3n:\/\/emrlogs\/",",
" "Name": "emrFromR",", " "BootstrapActions": [",
" {", " "BootstrapActionConfig": {", " "Name": "Bootstrap 0",",
" "ScriptBootstrapAction": {", " "Path": "s3:\/\/rtmpfwblrx\/bootstrap.sh",",
" "Args": []", " }", " }",
" }", " ],", " "ExecutionStatusDetail": {",
" "EndDateTime": 1278124414.0,", " "CreationDateTime": 1278123795.0,",
" "LastStateChangeReason": "Steps completed",", " "State": "COMPLETED",",
" "StartDateTime": 1278124000.0,", " "ReadyDateTime": 1278124237.0",
" },", " "Steps": [", " {", " "StepConfig": {",
" "ActionOnFailure": "CANCEL_AND_WAIT",", " "Name": "Example Streaming Step",",
" "HadoopJarStep": {", " "MainClass": null,",
" "Jar": "\/home\/hadoop\/contrib\/streaming\/hadoop-0.18-streaming.jar",",
" "Args": [", " "-input",", " "s3n:\/\/rtmpfwblrx\/stream.txt",",
" "-output",", " "s3n:\/\/rtmpfwblrxout\/",",
" "-mapper",", " "s3n:\/\/rtmpfwblrx\/mapper.R",",
" "-reducer",", " "cat",",
" "-cacheFile",", " "s3n:\/\/rtmpfwblrx\/emrData.RData#emrData.RData"",
" ],", " "Properties": []", " }",
" },", " "ExecutionStatusDetail": {", " "EndDateTime": 1278124322.0,",
" "CreationDateTime": 1278123795.0,", " "LastStateChangeReason": null,",
" "State": "COMPLETED",", " "StartDateTime": 1278124232.0",
" }", " }", " ],", " "JobFlowId": "j-2H9P770Z4B8GG",",
" "Instances": {", " "Ec2KeyName": "JL 09282009",",
" "InstanceCount": 2,", " "Placement": {",
" "AvailabilityZone": "us-east-1d"", " },",
" "KeepJobFlowAliveWhenNoSteps": false,", " "SlaveInstanceType": "m1.small",",
" "MasterInstanceType": "m1.small",", " "MasterPublicDnsName": "ec2-174-129-70-89.compute-1.amazonaws.com",",
" "MasterInstanceId": "i-2147b84b",", " "InstanceGroups": null,",
" "HadoopVersion": "0.18"", " }", " }", " ]",
"}")
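When reading output like the above, it may help to remember that `print()` and `dput()` show strings in R's escaped notation, so the console display can contain more backslashes than the string itself. The sketch below uses a hypothetical one-line stand-in (`line`) for an element of the `system()` result and compares the displayed form with the stored characters.

```r
# Hypothetical stand-in for one element of the system() result.
# The literal below stores the text  "LogUri": "s3n:\/\/emrlogs\/",
line <- "\"LogUri\": \"s3n:\\/\\/emrlogs\\/\","

print(line)       # escaped display: quotes shown as \" and backslashes doubled
writeLines(line)  # the characters actually stored in the string

# Count the backslash characters really present in the string
n_backslashes <- lengths(regmatches(line, gregexpr("\\", line, fixed = TRUE)))
n_backslashes
```

If `writeLines()` shows the same text as the shell does, the data was not escaped a second time and only the display differs.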
2 replies
贸会
Using your dput output here.
寒健