I have a JSON file that is formatted something like this:
{
    "unknown1":
    [
        {"text": "random text again",
         "time": "Thu May 15 19:21:59 +0000 2016"},
        {"text": "akmfkdlm safsa fasffalmfa",
         "time": "Thu May 21 09:53:51 +0000 2016"}
    ],
    "unknown2":
    [
        {"text": "fsda lmfalmfa",
         "time": "Thu May 21 09:53:51 +0000 2016"}
    ]
}
Each top-level key in the JSON is a random (unknown) label, and there can be any number of these unknowns. Each unknown always holds a list of text/time pairs.
I am trying to send each text to my REST POST service, which accepts JSON formatted as:
text: "foo bar bat",
mime_type: "text/html",
extract_type: "HP" # HP, MP
I get an error when I run my code, and I am not sure what to do.
Here is my code:
import json
import requests

with open('locations_stripped.json') as data_file:
    data = json.load(data_file)

headers = {'Content-Type' : 'application/json'}

for thing in data:
    for text, time in data.iteritems():
        print text
        body = [{ "text": text , "mime_type": "text/html", "extract_type": "HP"}]
        r = requests.post('localhost:3003/api/extract/run', data=body, headers=headers)
        print (r.content)
and here is the error:
$ python filterrest.py
unknown1
Traceback (most recent call last):
  File "filterrest.py", line 30, in <module>
    r = requests.post('localhost:3003/api/extract/run', data=body, headers=headers)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/api.py", line 111, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 461, in request
    prep = self.prepare_request(req)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 394, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/models.py", line 298, in prepare
    self.prepare_body(data, files, json)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/models.py", line 452, in prepare_body
    body = self._encode_params(data)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/models.py", line 89, in _encode_params
    for k, vs in to_key_val_list(data):
ValueError: too many values to unpack
One thing to note is that it prints the wrong text ("unknown1" instead of "random text again"), and I am not sure how to get it to print only the text.
Any help on this?
UPDATE
Per everyone's answers and comments, I changed my code:
...
for thing in data:
    for text in data[thing]:
        print text['text']
This prints text['text'] as I would expect, so the issue lies in the way I am making the request. As a test, I changed my code and set the data to something that I know should work (I ran it via Postman).
Changed code:
r = requests.post('localhost:3003/api/extract/run', data='Hello. Where does the brown fox go?', headers=headers)
Expected Response:
[
    {
        "score": 0.30253747367501777,
        "tag": "HP"
    }
]
Instead what gets printed is what looks like an entire HTML page.
About the first part of your question:
for thing in data:
    for text, time in data.iteritems():
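With this loop you won't get the text: data.iteritems() yields (key, list) pairs, so text is bound to the top-level label (for example "unknown1") and time to the whole list. Your update with: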
for thing in data:
    for text in data[thing]:
        print text['text']
is correct. Your headers are right too. The next problem is:
body = [{ "text": text , "mime_type": "text/html", "extract_type": "HP"}]
r = requests.post('localhost:3003/api/extract/run', data=body, headers=headers)
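To see why this call fails before anything is sent, here is a minimal sketch that mimics the loop shown at the bottom of the traceback (for k, vs in to_key_val_list(data)). It does not call requests at all; the helper name unpack_pairs is hypothetical, and the syntax is Python 3, where the message also includes "(expected 2)".

```python
def unpack_pairs(data):
    """Mimic the traceback's `for k, vs in ...` loop over a list passed as data."""
    pairs = []
    for k, vs in data:  # every item must unpack into exactly two values
        pairs.append((k, vs))
    return pairs


# Same shape as the question's body: a list holding one three-key dict
body = [{"text": "random text again", "mime_type": "text/html", "extract_type": "HP"}]

try:
    unpack_pairs(body)
    error_message = None
except ValueError as exc:
    # Iterating the dict yields its three keys, which cannot fit into (k, vs)
    error_message = str(exc)

print(error_message)

# A list of 2-tuples, by contrast, unpacks cleanly
ok_pairs = unpack_pairs([("text", "random text again")])
print(ok_pairs)
```

Wrapping the dict in a list is therefore what triggers the ValueError; a bare dict does not go through this failure.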
Now look at the documentation of the requests module:
Typically, you want to send some form-encoded data — much like an HTML form. To do this, simply pass a dictionary to the data argument. Your dictionary of data will automatically be form-encoded when the request is made
There are many times that you want to send data that is not form-encoded. If you pass in a string instead of a dict, that data will be posted directly.
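The difference between the two encodings described in these quotes can be illustrated with the standard library alone. This is only a sketch of what the resulting request bodies roughly look like, not the code requests runs internally; the sample text "foo bar bat" is taken from the question, and the syntax is Python 3.

```python
import json
from urllib.parse import urlencode

body = {"text": "foo bar bat", "mime_type": "text/html", "extract_type": "HP"}

# Form-encoded, as when a dict is passed to the data argument
form_encoded = urlencode(body)

# JSON-encoded, which is what a JSON REST service expects
json_encoded = json.dumps(body)

print(form_encoded)
print(json_encoded)
```

The first line printed is a key=value&key=value string like an HTML form submission, while the second is a JSON document, so a server that parses JSON will only understand the second.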
For the keyword parameter data you should pass either a dict or, for a JSON service, a valid JSON string. In the question your variable body is a list containing a dict (requests tries to unpack each list item as a key/value pair, hence the ValueError), and in your update it is a string that is not valid JSON. There are two solutions:
body = { "text": text , "mime_type": "text/html", "extract_type": "HP"}
# Don't forget: a dict passed as data is sent form-encoded.
# The request goes through, but this is not the intended way to send JSON.
r = requests.post('http://localhost:3003/api/extract/run', data=body, headers=headers)
Or
import json
body = { "text": text , "mime_type": "text/html", "extract_type": "HP"}
r = requests.post('http://localhost:3003/api/extract/run', data=json.dumps(body), headers=headers)
But the requests doc says:
Instead of encoding the dict yourself, you can also pass it directly using the json parameter (added in version 2.4.2) and it will be encoded automatically
So from version 2.4.2 onward it is better to use the keyword parameter json instead of data to send JSON data. This is the best solution:
body = { "text": text , "mime_type": "text/html", "extract_type": "HP"}
r = requests.post('http://localhost:3003/api/extract/run', json=body, headers=headers)
Summary
I used this source:
{
    "unknown1":
    [
        {"text": "random text again",
         "time": "Thu May 15 19:21:59 +0000 2016"},
        {"text": "akmfkdlm safsa fasffalmfa",
         "time": "Thu May 21 09:53:51 +0000 2016"}
    ],
    "unknown2":
    [
        {"text": "fsda lmfalmfa",
         "time": "Thu May 21 09:53:51 +0000 2016"}
    ]
}
Code:
import json
import requests

with open('locations_stripped.json') as data_file:
    data = json.load(data_file)

headers = {'Content-Type' : 'application/json'}

# Each top-level value is a list of {"text": ..., "time": ...} dicts
for list_values in data.values():
    for dict_element in list_values:
        text = dict_element['text']
        print(text)
        body = { "text": text , "mime_type": "text/html", "extract_type": "HP"}
        # requests needs the URL scheme (http://) to pick a connection adapter
        r = requests.post('http://localhost:3003/api/extract/run', json=body, headers=headers)
        print(r.content)
P.S.: I don't know your server, so I couldn't test it. I hope it works.
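Since the loop above cannot be checked without the server, one option is to separate building the request bodies from sending them, so the first half can be verified offline. The sketch below uses Python 3 syntax; the function names build_payloads and send_all are hypothetical, and the URL (with an http:// scheme added) is assumed from the question.

```python
import json


def build_payloads(data, mime_type="text/html", extract_type="HP"):
    """Return one request body per text entry, across all top-level labels."""
    payloads = []
    for entries in data.values():  # each value is a list of text/time dicts
        for entry in entries:
            payloads.append({
                "text": entry["text"],
                "mime_type": mime_type,
                "extract_type": extract_type,
            })
    return payloads


def send_all(payloads, url="http://localhost:3003/api/extract/run"):
    """POST each body as JSON; needs the third-party requests package."""
    import requests  # imported here so build_payloads works without it

    for body in payloads:
        # json= serializes the dict and sets the Content-Type header
        response = requests.post(url, json=body)
        print(response.status_code, response.text)


sample = json.loads("""
{
    "unknown1": [
        {"text": "random text again", "time": "Thu May 15 19:21:59 +0000 2016"},
        {"text": "akmfkdlm safsa fasffalmfa", "time": "Thu May 21 09:53:51 +0000 2016"}
    ],
    "unknown2": [
        {"text": "fsda lmfalmfa", "time": "Thu May 21 09:53:51 +0000 2016"}
    ]
}
""")

payloads = build_payloads(sample)
print(len(payloads))
# send_all(payloads)  # uncomment once the service is running
```

Printing the payloads first makes it easy to compare them with the body that works in Postman before any request is made.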
Assuming you have valid JSON, you first need to traverse the list stored under each "unknown" key; each of these lists in turn contains dictionaries with text and time keys.
for unknown_key in data:
    for obj in data[unknown_key]:
        body = { "text": obj['text'] , "mime_type": "text/html", "extract_type": "HP"}
        r = requests.post('http://localhost:3003/api/extract/run', data=body, headers=headers)
        print (r.content)