Gmail PHP API: cannot retrieve the body of an email
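I'm having trouble with the Gmail PHP API.
I want to retrieve the body content of emails, but I can only retrieve it for emails that have attachments! The question is: why?
Here is my code so far: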
// Authentication things above...
$client = getClient();
$gmail = new Google_Service_Gmail($client);

$list = $gmail->users_messages->listUsersMessages('me', ['maxResults' => 1000]);

while ($list->getMessages() != null) {
    foreach ($list->getMessages() as $mlist) {
        $message_id = $mlist->id;
        $optParamsGet2['format'] = 'full';
        $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);

        $threadId = $single_message->getThreadId();
        $payload = $single_message->getPayload();
        $headers = $payload->getHeaders();
        $parts = $payload->getParts();
        //print_r($parts); PRINTS SOMETHING ONLY IF I GOT ATTACHMENTS...
        $body = $parts[0]['body'];
        $rawData = $body->data;
        $sanitizedData = strtr($rawData, '-_', '+/');
        $decodedMessage = base64_decode($sanitizedData); //should display my body content
    }

    if ($list->getNextPageToken() != null) {
        $pageToken = $list->getNextPageToken();
        $list = $gmail->users_messages->listUsersMessages('me', ['pageToken' => $pageToken, 'maxResults' => 1000]);
    } else {
        break;
    }
}
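The second option I know of for retrieving the content is to use the snippet located in the Headers part, but it only retrieves the first 50 characters or so, which is not very useful.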
Let us do a little experiment. I have sent myself two emails: one with an attachment and one without.
Request:
GET https://www.googleapis.com/gmail/v1/users/me/messages?maxResults=2
Response:
{
  "messages": [
    { "id": "14fe21fd6b3fb46f", "threadId": "14fe21fd6b3fb46f" },
    { "id": "14fe21f9341ed73c", "threadId": "14fe21f9341ed73c" }
  ],
  "nextPageToken": "08943597140129624594",
  "resultSizeEstimate": 3
}
I ask for the payload only, since that is where all the relevant parts are:
fields = payload

GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21fd6b3fb46f?fields=payload
GET https://www.googleapis.com/gmail/v1/users/me/messages/14fe21f9341ed73c?fields=payload
Mail without attachment:
{
  "payload": {
    "parts": [
      {
        "partId": "0",
        "mimeType": "text/plain",
        "filename": "",
        "headers": [
          { "name": "Content-Type", "value": "text/plain; charset=UTF-8" }
        ],
        "body": {
          "size": 22,
          "data": "aGVjaz8gTm8gYXR0YWNobWVudD8NCg=="
        }
      },
      {
        "partId": "1",
        "mimeType": "text/html",
        "filename": "",
        "headers": [
          { "name": "Content-Type", "value": "text/html; charset=UTF-8" }
        ],
        "body": {
          "size": 43,
          "data": "PGRpdiBkaXI9Imx0ciI-aGVjaz8gTm8gYXR0YWNobWVudD88L2Rpdj4NCg=="
        }
      }
    ]
  }
}
Mail with attachment:
{
  "payload": {
    "parts": [
      {
        "mimeType": "multipart/alternative",
        "filename": "",
        "headers": [
          { "name": "Content-Type", "value": "multipart/alternative; boundary=001a1142e23c551e8e05200b4be0" }
        ],
        "body": { "size": 0 },
        "parts": [
          {
            "partId": "0.0",
            "mimeType": "text/plain",
            "filename": "",
            "headers": [
              { "name": "Content-Type", "value": "text/plain; charset=UTF-8" }
            ],
            "body": { "size": 9, "data": "V293IG1hbg0K" }
          },
          {
            "partId": "0.1",
            "mimeType": "text/html",
            "filename": "",
            "headers": [
              { "name": "Content-Type", "value": "text/html; charset=UTF-8" }
            ],
            "body": { "size": 30, "data": "PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K" }
          }
        ]
      },
      {
        "partId": "1",
        "mimeType": "image/jpeg",
        "filename": "feelthebern.jpg",
        "headers": [
          { "name": "Content-Type", "value": "image/jpeg; name=\"feelthebern.jpg\"" },
          { "name": "Content-Disposition", "value": "attachment; filename=\"feelthebern.jpg\"" },
          { "name": "Content-Transfer-Encoding", "value": "base64" },
          { "name": "X-Attachment-Id", "value": "f_ieq3ev0i0" }
        ],
        "body": {
          "attachmentId": "ANGjdJ_2xG3WOiLh6MbUdYy4vo2VhV2kOso5AyuJW3333rbmk8BIE1GJHIOXkNIVGiphP3fGe7iuIl_MGzXBGNGvNslwlz8hOkvJZg2DaasVZsdVFT_5JGvJOLefgaSL4hqKJgtzOZG9K1XSMrRQAtz2V0NX7puPdXDU4gvalSuMRGwBhr_oDSfx2xljHEbGG6I4VLeLZfrzGGKW7BF-GO_FUxzJR8SizRYqIhgZNA6PfRGyOhf1s7bAPNW3M9KqWRgaK07WTOYl7DzW4hpNBPA4jrl7tgsssExHpfviFL7yL52lxsmbsiLe81Z5UoM",
          "size": 100446
        }
      }
    ]
  }
}
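These responses correspond to $parts in your code. As you can see, if you are lucky, $parts[0]['body']->data will give you what you want, but most of the time it won't: in the mail with the attachment, the first part is a multipart/alternative container whose body is empty, and the text sits one level deeper.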
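To make the two sample responses concrete, here is a minimal sketch of decoding the data fields by hand. It assumes Node.js (for Buffer), and decodeBase64Url is a name made up for this illustration, not part of any Gmail client library.

```javascript
// Decode the base64url-encoded "data" field of a Gmail message part.
// decodeBase64Url is a hypothetical helper name used only in this sketch.
function decodeBase64Url(data) {
  // Gmail uses the URL-safe alphabet: '-' instead of '+', '_' instead of '/'.
  const standard = data.replace(/-/g, '+').replace(/_/g, '/');
  return Buffer.from(standard, 'base64').toString('utf8');
}

// Values copied from the two sample responses above.
const noAttachment = decodeBase64Url('aGVjaz8gTm8gYXR0YWNobWVudD8NCg==');
const withAttachment = decodeBase64Url('V293IG1hbg0K');

// JSON.stringify makes the trailing CRLF visible.
console.log(JSON.stringify(noAttachment));   // "heck? No attachment?\r\n"
console.log(JSON.stringify(withAttachment)); // "Wow man\r\n"
```

The decoded lengths (22 and 9 bytes) match the size fields in the responses, which is a quick sanity check that the right field was decoded.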
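There are generally two approaches to this problem. You could implement the following algorithm (you are much better at PHP than I am, but this is the general outline of it):
1. Traverse payload.parts and check whether it contains the part you are looking for (either text/plain or text/html). If it does, you are done with your search. If you are parsing a mail like the one above without attachments, this is enough.
2. Do step 1 again, but this time recursively, with the parts found inside the parts you just checked. You will eventually find your part. If you are parsing a mail like the one above with an attachment, you will eventually find your body.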
The algorithm could look something like the following (example in JavaScript):
var response = {
  "payload": {
    "parts": [
      {
        "mimeType": "multipart/alternative",
        "filename": "",
        "headers": [
          { "name": "Content-Type", "value": "multipart/alternative; boundary=001a1142e23c551e8e05200b4be0" }
        ],
        "body": { "size": 0 },
        "parts": [
          {
            "partId": "0.0",
            "mimeType": "text/plain",
            "filename": "",
            "headers": [
              { "name": "Content-Type", "value": "text/plain; charset=UTF-8" }
            ],
            "body": { "size": 9, "data": "V293IG1hbg0K" }
          },
          {
            "partId": "0.1",
            "mimeType": "text/html",
            "filename": "",
            "headers": [
              { "name": "Content-Type", "value": "text/html; charset=UTF-8" }
            ],
            "body": { "size": 30, "data": "PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K" }
          }
        ]
      },
      {
        "partId": "1",
        "mimeType": "image/jpeg",
        "filename": "feelthebern.jpg",
        "headers": [
          { "name": "Content-Type", "value": "image/jpeg; name=\"feelthebern.jpg\"" },
          { "name": "Content-Disposition", "value": "attachment; filename=\"feelthebern.jpg\"" },
          { "name": "Content-Transfer-Encoding", "value": "base64" },
          { "name": "X-Attachment-Id", "value": "f_ieq3ev0i0" }
        ],
        "body": {
          "attachmentId": "ANGjdJ_2xG3WOiLh6MbUdYy4vo2VhV2kOso5AyuJW3333rbmk8BIE1GJHIOXkNIVGiphP3fGe7iuIl_MGzXBGNGvNslwlz8hOkvJZg2DaasVZsdVFT_5JGvJOLefgaSL4hqKJgtzOZG9K1XSMrRQAtz2V0NX7puPdXDU4gvalSuMRGwBhr_oDSfx2xljHEbGG6I4VLeLZfrzGGKW7BF-GO_FUxzJR8SizRYqIhgZNA6PfRGyOhf1s7bAPNW3M9KqWRgaK07WTOYl7DzW4hpNBPA4jrl7tgsssExHpfviFL7yL52lxsmbsiLe81Z5UoM",
          "size": 100446
        }
      }
    ]
  }
};

// In e.g. a plain text message, the payload is the only part.
var parts = [response.payload];

while (parts.length) {
  var part = parts.shift();
  // Queue nested parts so they get inspected on a later iteration.
  if (part.parts) {
    parts = parts.concat(part.parts);
  }

  if (part.mimeType === 'text/html') {
    // Convert base64url to base64, decode it, then interpret the bytes as UTF-8.
    var decodedPart = decodeURIComponent(escape(atob(part.body.data.replace(/\-/g, '+').replace(/\_/g, '/'))));
    console.log(decodedPart);
  }
}
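The loop above logs every text/html part it meets. If you would rather get back a single part and fall back to text/plain when there is no HTML, a recursive variant might look like the sketch below. findPart is a hypothetical helper name, and the payload is a trimmed-down copy of the "mail with attachment" response (headers and the attachmentId are left out for brevity).

```javascript
// findPart is a hypothetical helper: depth-first search for the first part
// with the given MIME type that actually carries body data.
function findPart(part, mimeType) {
  if (part.mimeType === mimeType && part.body && part.body.data) {
    return part;
  }
  // Container parts (multipart/*) keep their children in "parts".
  for (const child of part.parts || []) {
    const found = findPart(child, mimeType);
    if (found) {
      return found;
    }
  }
  return null;
}

// Trimmed-down version of the "mail with attachment" payload.
const payload = {
  parts: [
    {
      mimeType: 'multipart/alternative',
      body: { size: 0 },
      parts: [
        { mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } },
        { mimeType: 'text/html', body: { size: 30, data: 'PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K' } }
      ]
    },
    { mimeType: 'image/jpeg', filename: 'feelthebern.jpg', body: { size: 100446 } }
  ]
};

// Prefer HTML, fall back to plain text.
const bodyPart = findPart(payload, 'text/html') || findPart(payload, 'text/plain');
console.log(bodyPart.mimeType); // text/html
```

Because the search starts at the payload itself, the same call also covers messages whose parts are not nested at all.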
Update: you may want to check my second answer below for more complete code.
I finally got it working today, so here is the complete code answer for finding the body, thanks to @Tholle:
// Authentication things above

/*
 * Decode the body.
 * @param : encoded body - or null
 * @return : the body if found, else FALSE;
 */
function decodeBody($body) {
    $rawData = $body;
    $sanitizedData = strtr($rawData, '-_', '+/');
    $decodedMessage = base64_decode($sanitizedData);
    if (!$decodedMessage) {
        $decodedMessage = FALSE;
    }
    return $decodedMessage;
}

$client = getClient();
$gmail = new Google_Service_Gmail($client);

$list = $gmail->users_messages->listUsersMessages('me', ['maxResults' => 1000]);

try {
    while ($list->getMessages() != null) {

        foreach ($list->getMessages() as $mlist) {

            $message_id = $mlist->id;
            $optParamsGet2['format'] = 'full';
            $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);
            $payload = $single_message->getPayload();

            // With no attachment, the payload might be directly in the body, encoded.
            $body = $payload->getBody();
            $FOUND_BODY = decodeBody($body['data']);

            // If we didn't find a body, let's look for the parts
            if (!$FOUND_BODY) {
                $parts = $payload->getParts();
                foreach ($parts as $part) {
                    if ($part['body']) {
                        $FOUND_BODY = decodeBody($part['body']->data);
                        break;
                    }
                    // Last try: if we didn't find the body in the first parts,
                    // let's loop into the parts of the parts (as @Tholle suggested).
                    if ($part['parts'] && !$FOUND_BODY) {
                        foreach ($part['parts'] as $p) {
                            // replace 'text/html' by 'text/plain' if you prefer
                            if ($p['mimeType'] === 'text/html' && $p['body']) {
                                $FOUND_BODY = decodeBody($p['body']->data);
                                break;
                            }
                        }
                    }
                    if ($FOUND_BODY) {
                        break;
                    }
                }
            }
            // Finally, print the message ID and the body
            print_r($message_id . " : " . $FOUND_BODY);
        }

        if ($list->getNextPageToken() != null) {
            $pageToken = $list->getNextPageToken();
            $list = $gmail->users_messages->listUsersMessages('me', ['pageToken' => $pageToken, 'maxResults' => 1000]);
        } else {
            break;
        }
    }
} catch (Exception $e) {
    echo $e->getMessage();
}
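As you can see, my problem was that sometimes the body cannot be found in payload->parts but sits directly in payload->body! (Plus, I added the loop over nested parts.)
Hope this helps somebody else.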
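To illustrate the two places a body can live, here is a small sketch that treats the payload itself as a part, so both cases go through the same code path. extractBodyData is a hypothetical helper, and the shape of singlePart (a MIME type and body data directly on the payload, with no parts) is an assumption based on the behavior described above, not a captured API response.

```javascript
// extractBodyData is a hypothetical helper. It walks the payload and all
// nested parts, trying each preferred MIME type in order.
function extractBodyData(payload, preferred = ['text/html', 'text/plain']) {
  for (const mimeType of preferred) {
    const stack = [payload];
    while (stack.length) {
      const part = stack.pop();
      if (part.mimeType === mimeType && part.body && part.body.data) {
        return part.body.data;
      }
      // Push children in reverse so they are visited in document order.
      const children = part.parts || [];
      for (let i = children.length - 1; i >= 0; i--) {
        stack.push(children[i]);
      }
    }
  }
  return null; // no text body found
}

// Assumed single-part message: the data sits on payload.body.
const singlePart = {
  mimeType: 'text/plain',
  body: { size: 9, data: 'V293IG1hbg0K' }
};

// Multipart message: the data sits one level down, in payload.parts.
const multiPart = {
  mimeType: 'multipart/alternative',
  body: { size: 0 },
  parts: [
    { mimeType: 'text/plain', body: { size: 9, data: 'V293IG1hbg0K' } },
    { mimeType: 'text/html', body: { size: 30, data: 'PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K' } }
  ]
};

console.log(extractBodyData(singlePart)); // V293IG1hbg0K
console.log(extractBodyData(multiPart));  // PGRpdiBkaXI9Imx0ciI-V293IG1hbjwvZGl2Pg0K
```

The returned string is still base64url-encoded; it needs the same '-_' to '+/' translation and base64 decoding as in the PHP decodeBody function.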
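For those who are interested, I have greatly improved my last answer so that it works with text/html (falling back to text/plain if necessary) and converts images into base64 attachments, which load automatically when the result is printed as full HTML!
The code isn't perfect and is too long to explain in detail, but it is working for me.
Feel free to take it and adapt it (and maybe correct or improve it if necessary).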
// Authentication things above

/*
 * Decode the body.
 * @param : encoded body - or null
 * @return : the body if found, else FALSE;
 */
function decodeBody($body) {
    $rawData = $body;
    $sanitizedData = strtr($rawData, '-_', '+/');
    $decodedMessage = base64_decode($sanitizedData);
    if (!$decodedMessage) {
        $decodedMessage = FALSE;
    }
    return $decodedMessage;
}

$client = getClient();
$gmail = new Google_Service_Gmail($client);

$list = $gmail->users_messages->listUsersMessages('me', ['maxResults' => 1000]);

try {
    while ($list->getMessages() != null) {

        foreach ($list->getMessages() as $mlist) {

            $message_id = $mlist->id;
            $optParamsGet2['format'] = 'full';
            $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);
            $payload = $single_message->getPayload();
            $parts = $payload->getParts();
            // With no attachment, the payload might be directly in the body, encoded.
            $body = $payload->getBody();
            $FOUND_BODY = FALSE;
            // If we didn't find a body, let's look for the parts
            if (!$FOUND_BODY) {
                foreach ($parts as $part) {
                    if ($part['parts'] && !$FOUND_BODY) {
                        foreach ($part['parts'] as $p) {
                            if ($p['parts'] && count($p['parts']) > 0) {
                                foreach ($p['parts'] as $y) {
                                    if (($y['mimeType'] === 'text/html') && $y['body']) {
                                        $FOUND_BODY = decodeBody($y['body']->data);
                                        break;
                                    }
                                }
                            } else if (($p['mimeType'] === 'text/html') && $p['body']) {
                                $FOUND_BODY = decodeBody($p['body']->data);
                                break;
                            }
                        }
                    }
                    if ($FOUND_BODY) {
                        break;
                    }
                }
            }
            // let's save all the images linked to the mail's body:
            if ($FOUND_BODY && count($parts) > 1) {
                $images_linked = array();
                foreach ($parts as $part) {
                    if ($part['filename']) {
                        array_push($images_linked, $part);
                    } else {
                        if ($part['parts']) {
                            foreach ($part['parts'] as $p) {
                                if ($p['parts'] && count($p['parts']) > 0) {
                                    foreach ($p['parts'] as $y) {
                                        if (($y['mimeType'] === 'text/html') && $y['body']) {
                                            array_push($images_linked, $y);
                                        }
                                    }
                                } else if (($p['mimeType'] !== 'text/html') && $p['body']) {
                                    array_push($images_linked, $p);
                                }
                            }
                        }
                    }
                }
                // special case for the wdcid...
                preg_match_all('/wdcid(.*)"/Uims', $FOUND_BODY, $wdmatches);
                if (count($wdmatches)) {
                    $z = 0;
                    foreach ($wdmatches[0] as $match) {
                        $z++;
                        if ($z > 9) {
                            $FOUND_BODY = str_replace($match, 'image0' . $z . '@', $FOUND_BODY);
                        } else {
                            $FOUND_BODY = str_replace($match, 'image00' . $z . '@', $FOUND_BODY);
                        }
                    }
                }
                preg_match_all('/src="cid:(.*)"/Uims', $FOUND_BODY, $matches);
                if (count($matches)) {
                    $search = array();
                    $replace = array();
                    // let's transform the CIDs into base64 attachments
                    foreach ($matches[1] as $match) {
                        foreach ($images_linked as $img_linked) {
                            foreach ($img_linked['headers'] as $img_lnk) {
                                if ($img_lnk['name'] === 'Content-ID' || $img_lnk['name'] === 'Content-Id' || $img_lnk['name'] === 'X-Attachment-Id') {
                                    if ($match === str_replace('>', '', str_replace('<', '', $img_lnk->value))
                                        || explode("@", $match)[0] === explode(".", $img_linked->filename)[0]
                                        || explode("@", $match)[0] === $img_linked->filename) {
                                        $search = "src=\"cid:$match\"";
                                        $mimetype = $img_linked->mimeType;
                                        $attachment = $gmail->users_messages_attachments->get('me', $mlist->id, $img_linked['body']->attachmentId);
                                        $data64 = strtr($attachment->getData(), array('-' => '+', '_' => '/'));
                                        $replace = "src=\"data:" . $mimetype . ";base64," . $data64 . "\"";
                                        $FOUND_BODY = str_replace($search, $replace, $FOUND_BODY);
                                    }
                                }
                            }
                        }
                    }
                }
            }
            // If we didn't find the body in the last parts,
            // let's loop for the first parts (text-html only)
            if (!$FOUND_BODY) {
                foreach ($parts as $part) {
                    if ($part['body'] && $part['mimeType'] === 'text/html') {
                        $FOUND_BODY = decodeBody($part['body']->data);
                        break;
                    }
                }
            }
            // With no attachment, the payload might be directly in the body, encoded.
            if (!$FOUND_BODY) {
                $FOUND_BODY = decodeBody($body['data']);
            }
            // Last try: if we didn't find the body in the last parts,
            // let's loop for the first parts (text-plain only)
            if (!$FOUND_BODY) {
                foreach ($parts as $part) {
                    if ($part['body']) {
                        $FOUND_BODY = decodeBody($part['body']->data);
                        break;
                    }
                }
            }
            if (!$FOUND_BODY) {
                $FOUND_BODY = '(No message)';
            }
            // Finally, print the message ID and the body
            print_r($message_id . ": " . $FOUND_BODY);
        }

        if ($list->getNextPageToken() != null) {
            $pageToken = $list->getNextPageToken();
            $list = $gmail->users_messages->listUsersMessages('me', ['pageToken' => $pageToken, 'maxResults' => 1000]);
        } else {
            break;
        }
    }
} catch (Exception $e) {
    echo $e->getMessage();
}
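The CID replacement step is the hardest part of the code above to follow, so here is the idea in isolation, written in JavaScript to match the earlier algorithm example. This is only a sketch: inlineCidImages and the shape of the images map are made up for the illustration, and the image data is a placeholder string rather than a real attachment.

```javascript
// inlineCidImages is a hypothetical helper. `images` maps a Content-ID
// (without angle brackets) to { mimeType, data }, where data is the
// base64url string returned for the attachment.
function inlineCidImages(html, images) {
  return html.replace(/src="cid:([^"]+)"/gi, (match, cid) => {
    const image = images[cid];
    if (!image) {
      return match; // leave unknown references untouched
    }
    // data: URIs need the standard base64 alphabet, not the URL-safe one.
    const standard = image.data.replace(/-/g, '+').replace(/_/g, '/');
    return `src="data:${image.mimeType};base64,${standard}"`;
  });
}

const html = '<p>Hi</p><img src="cid:logo@example"><img src="cid:missing">';
const images = {
  'logo@example': { mimeType: 'image/png', data: 'iVBORw0KGgo-_w' } // placeholder data
};

console.log(inlineCidImages(html, images));
```

Building the map from the Content-ID (or X-Attachment-Id) headers of the image parts up front keeps the replacement itself to a single pass over the HTML.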
Cheers.
I wrote this code as an improvement on @F3L1X79's answer, since it filters HTML responses correctly.
<?php
ini_set("display_errors", 1);
ini_set("track_errors", 1);
ini_set("html_errors", 1);
error_reporting(E_ALL);

require_once __DIR__ . '/vendor/autoload.php';

session_start();

function decodeBody($body) {
    $rawData = $body;
    $sanitizedData = strtr($rawData, '-_', '+/');
    $decodedMessage = base64_decode($sanitizedData);
    if (!$decodedMessage) {
        $decodedMessage = FALSE;
    }
    return $decodedMessage;
}

function fetchMails($gmail, $q) {

    try {
        $list = $gmail->users_messages->listUsersMessages('me', array('q' => $q));
        while ($list->getMessages() != null) {

            foreach ($list->getMessages() as $mlist) {

                $message_id = $mlist->id;
                $optParamsGet2['format'] = 'full';
                $single_message = $gmail->users_messages->get('me', $message_id, $optParamsGet2);
                $payload = $single_message->getPayload();

                // With no attachment, the payload might be directly in the body, encoded.
                $body = $payload->getBody();
                $FOUND_BODY = decodeBody($body['data']);

                // If we didn't find a body, let's look for the parts
                if (!$FOUND_BODY) {
                    $parts = $payload->getParts();
                    foreach ($parts as $part) {
                        if ($part['body'] && $part['mimeType'] == 'text/html') {
                            $FOUND_BODY = decodeBody($part['body']->data);
                            break;
                        }
                    }
                }
                if (!$FOUND_BODY) {
                    foreach ($parts as $part) {
                        // Last try: if we didn't find the body in the first parts,
                        // let's loop into the parts of the parts (as @Tholle suggested).
                        if ($part['parts'] && !$FOUND_BODY) {
                            foreach ($part['parts'] as $p) {
                                // replace 'text/html' by 'text/plain' if you prefer
                                if ($p['mimeType'] === 'text/html' && $p['body']) {
                                    $FOUND_BODY = decodeBody($p['body']->data);
                                    break;
                                }
                            }
                        }
                        if ($FOUND_BODY) {
                            break;
                        }
                    }
                }
                // Finally, print the message ID and the body
                print_r($message_id . " <br> <br> <br> *-*-*- " . $FOUND_BODY);
            }

            if ($list->getNextPageToken() != null) {
                $pageToken = $list->getNextPageToken();
                $list = $gmail->users_messages->listUsersMessages('me', array('pageToken' => $pageToken));
            } else {
                break;
            }
        }
    } catch (Exception $e) {
        echo $e->getMessage();
    }
}

$client = new Google_Client();
$client->setAuthConfig('client_secrets.json');
$client->addScope(Google_Service_Gmail::GMAIL_READONLY);

if (isset($_SESSION['access_token']) && $_SESSION['access_token']) {
    $client->setAccessToken($_SESSION['access_token']);
    $gmail = new Google_Service_Gmail($client);
    $q = ' after:2016/11/7';
    fetchMails($gmail, $q);
} else {
    $redirect_uri = 'http://' . $_SERVER['HTTP_HOST'] . '/gmail-api/oauth2callback.php';
    header('Location: ' . filter_var($redirect_uri, FILTER_SANITIZE_URL));
}