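Blogger Importer

Description

Blogger Importer helps you import data from a Google Blogger site into a WordPress site.

What is imported

  • Categories (Blogger labels)
  • Posts (published, scheduled, or draft)
  • Comments (only those not marked as spam)
  • Images

What is not imported

  • Pages
  • Widgets/widget data
  • Templates/themes
  • Comment and author avatars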

Reference projects

  • https://www.simplepie.org/

The following projects were used as references for implementing the import of images and links:

  • https://wordpress.org/plugins/remote-images-grabber/
  • http://notions.okuda.ca/wordpress-plugins/blogger-image-import/
  • https://wordpress.org/plugins/cache-images/
  • https://wordpress.org/plugins/tumblr-importer/
  • https://core.trac.wordpress.org/ticket/14525
  • https://wpengineer.com/1735/easier-better-solutions-to-get-pictures-on-your-posts/
  • https://web.archive.org/web/20211121020918/http://www.velvetblues.com/web-development-blog/wordpress-plugin-update-urls/
  • http://wordpress.stackexchange.com/questions//media-sideload-image-file-name (not working)
  • https://code.tutsplus.com/a-guide-to-the-wordpress-http-api-the-basics--wp-25125t

Known issues

  • Some users have reported that their iframes are stripped out of the post content.
  • Users have requested better performance for larger transfers and for transfers of images.
  • Behavior on re-import needs review, particularly whether the counts are correct.
  • Using get_posts or get_comments with the appropriate parameters, instead of SQL, to get the counts and check for existing items needs review.
  • An incorrect notice appears: “PHP Notice: The data could not be converted to UTF-8. You MUST have either the iconv or mbstring extension installed.” This occurs even when iconv is installed and could be related to Blogger reporting 0 comments.
  • While the importer is running, it cannot be stopped using the stop button.
  • Blogger’s count of comments includes those not linked to a post, e.g. when the post has been deleted.

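Filters and actions

You can use these actions and filters to extend the importer’s functionality without modifying its code.

Action – import_start – Runs when the importer starts processing the records for a new blog.

Action – import_done – Runs when the importer finishes processing a blog.

Filter – blogger_importer_congrats – Passes the list of options shown to the user when a blog has finished importing; options can be added or removed.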
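As an illustration, a small companion plugin could hook all three. This is a minimal sketch, not the importer’s own code: only the hook names come from the descriptions above, the callback names are hypothetical, and it assumes the filter passes a plain array of option strings.

```php
<?php
// Hypothetical callback names; only the three hook names come from the plugin.

/**
 * Append one entry to the list shown when a blog finishes importing.
 * Assumption: the filter passes an array of strings.
 */
function my_blogger_congrats_options( $options ) {
    $options   = (array) $options;
    $options[] = 'Review the imported categories and convert some of them to tags.';
    return $options;
}

/** Note in the PHP error log that the importer started on a blog. */
function my_blogger_import_started() {
    error_log( 'Blogger import started at ' . gmdate( 'c' ) );
}

/** Note in the PHP error log that the importer finished a blog. */
function my_blogger_import_finished() {
    error_log( 'Blogger import finished at ' . gmdate( 'c' ) );
}

// Register the callbacks only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_action( 'import_start', 'my_blogger_import_started' );
    add_action( 'import_done', 'my_blogger_import_finished' );
    add_filter( 'blogger_importer_congrats', 'my_blogger_congrats_options' );
}
```

Keeping the filter callback a plain function that takes and returns an array makes it easy to check outside WordPress before relying on it during a real import.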

Installation

  1. Upload the blogger-importer folder to the /wp-content/plugins/ directory.
  2. Activate the plugin through the [Plugins] menu in the WordPress admin.

Prerequisites

The importer connects your server to the Blogger server to copy across the posts. For this to work, your server needs connectivity to the internet and at least one of the remote access protocols enabled, e.g. curl, streams or fsockopen. You can use the Core Control plugin to test whether these are working correctly. The importer connects to Google over a secure connection, so OpenSSL needs to be enabled on your server.
The importer uses the SimplePie classes to read and process the data from Blogger, so you will need the php-xml module installed on your web server.
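If you would rather check these from a script than install another plugin, something like the following can be run on the server. It is a rough sketch under stated assumptions: the function names are hypothetical, and each check is only an approximation of what the corresponding transport or module needs, not the test WordPress itself performs.

```php
<?php
/**
 * Report which of the prerequisites appear to be available.
 * Hypothetical helper; the checks approximate each requirement.
 */
function my_blogger_prerequisites() {
    return array(
        'curl'      => extension_loaded( 'curl' ),
        'streams'   => function_exists( 'stream_socket_client' ),
        'fsockopen' => function_exists( 'fsockopen' ),
        'openssl'   => extension_loaded( 'openssl' ),
        'php-xml'   => class_exists( 'DOMDocument' ),
    );
}

/**
 * True when at least one transport, OpenSSL and the XML module are present.
 */
function my_blogger_can_import( array $checks ) {
    $has_transport = $checks['curl'] || $checks['streams'] || $checks['fsockopen'];
    return $has_transport && $checks['openssl'] && $checks['php-xml'];
}

// Print one line per check when run from the command line.
if ( PHP_SAPI === 'cli' ) {
    foreach ( my_blogger_prerequisites() as $name => $ok ) {
        echo str_pad( $name, 10 ) . ( $ok ? 'ok' : 'missing' ) . PHP_EOL;
    }
}
```

A “missing” line only tells you where to look; the web server may load a different PHP configuration from the command line, so confirm the result in the environment that actually serves WordPress.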

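Before importing

We strongly recommend disabling all plugins other than Blogger Importer, along with any caching, before you import content.

Disabling other plugins and caching helps the data transfer run smoothly so that posts and comments are imported correctly.

How to import Blogger content

  1. Sign in to the Google account you use for Blogger, go to [Settings] → [Other] on the Blogger site you want to export, and click the [Export content] button. This downloads an XML file containing all of the site’s posts and comments.
  2. In your WordPress site, Blogger Importer is available under [Tools] → [Import].
  3. Upload the exported XML file to your WordPress site.
  4. Once the posts have been read, assign an appropriate author to the imported posts as needed.
  5. Wait for the import to finish.
  6. If the import fails partway through, simply run it again; the importer skips posts that have already been imported, so no duplicates are created.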

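FAQ

How do I re-import?

Simply upload the XML file again. The importer skips posts that have already been imported, so no duplicates are created.

Once I’ve imported the Blogger content, do I need to keep the plugin?

No. Once the Blogger content has been imported, you can remove the plugin.

How do I know which posts were imported?

Each imported post is tagged with post meta entries indicating where it was loaded from. The permalink is set to the visible URL if the post was published, or to the internal ID if it was still a draft or scheduled post.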

  • blogger_author
  • blogger_blog
  • blogger_permalink
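To list the imported posts yourself, you could query for one of these meta keys. The sketch below builds arguments for the core get_posts() function; the helper names are hypothetical, and the exact format of the value stored in blogger_blog is an assumption, so compare it with a real imported post first.

```php
<?php
/**
 * Build get_posts() arguments that match posts carrying Blogger meta.
 * Hypothetical helper; $blog is assumed to match the stored blogger_blog value.
 */
function my_blogger_imported_posts_args( $blog = '' ) {
    $meta_query = array(
        array(
            'key'     => 'blogger_permalink',
            'compare' => 'EXISTS',
        ),
    );
    if ( '' !== $blog ) {
        $meta_query[] = array(
            'key'   => 'blogger_blog',
            'value' => $blog,
        );
    }
    return array(
        'post_type'      => 'post',
        'post_status'    => 'any',
        'posts_per_page' => -1,
        'fields'         => 'ids',
        'meta_query'     => $meta_query,
    );
}

/** Count imported posts; returns null when WordPress is not loaded. */
function my_blogger_count_imported_posts( $blog = '' ) {
    if ( ! function_exists( 'get_posts' ) ) {
        return null;
    }
    return count( get_posts( my_blogger_imported_posts_args( $blog ) ) );
}
```

Requesting only IDs keeps memory use low on large blogs; drop the 'fields' entry if you need full post objects.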

After importing there are a lot of categories

Blogger does not distinguish between tags and categories, so you will likely want to review what was imported and then use the categories to tags converter.

What about Blogger pages?

This importer does not handle Blogger pages; you will need to transfer them manually.

What about images?

This version of the importer imports these too, but you can disable this via a setting in the blogger-importer.php file. Tracking images of size 1×1 are not processed. If you wish to specifically exclude other images, you could code something for the image_filter function.

What size are the images?

The importer will attempt to download a large version of the file if one is available. This is controlled by the setting “LARGE_IMAGE_SIZE” and defaults to a width of 1024. The display size of the images is the “medium” size of images as defined in WordPress. You can change this in advance if you want to show a different size.

How do I know which images were skipped?

If you hover over the progress bar for images it will tell you how many images are skipped. To see the filenames of these images you will need to enable WordPress debugging to log to file. See https://wordpress.org/documentation/article/debugging-in-wordpress/
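For reference, logging to a file is typically switched on with the standard WordPress debugging constants in wp-config.php. This is a configuration fragment rather than anything specific to the importer:

```php
// In wp-config.php, above the "That's all, stop editing!" line.
define( 'WP_DEBUG', true );          // turn on debug mode
define( 'WP_DEBUG_LOG', true );      // write messages to wp-content/debug.log by default
define( 'WP_DEBUG_DISPLAY', false ); // keep messages out of the rendered pages
```

Remember to set WP_DEBUG back to false once the import is finished.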

What about scheduled posts on Blogger?

Scheduled posts are transferred and will be published as specified. However, Blogger and WordPress handle drafts differently: WordPress does not support dates on draft posts, so you will need to use a plugin if you wish to plan your writing schedule.

Are the permalinks the same?

No. WordPress and Blogger handle permalinks completely differently. However, you can use a redirection plugin or edit your site’s .htaccess file to map the old URLs to the new ones.
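If you take the .htaccess route, the old URL recorded for each imported post can be turned into a redirect rule. The following is a minimal sketch with a hypothetical function name; it assumes Apache’s Redirect directive (mod_alias) is available and that the old value is either a full URL or a path starting with “/”.

```php
<?php
/**
 * Build one "Redirect 301" line from an old Blogger URL to a new URL.
 * Returns null when the old value has no usable path, for example the
 * internal ID stored for drafts. Hypothetical helper.
 */
function my_blogger_redirect_line( $old_permalink, $new_url ) {
    $path = parse_url( $old_permalink, PHP_URL_PATH );
    if ( ! is_string( $path ) || '/' === $path || strpos( $path, '/' ) !== 0 ) {
        return null;
    }
    return 'Redirect 301 ' . $path . ' ' . $new_url;
}

// Example with made-up URLs.
echo my_blogger_redirect_line(
    'http://example.blogspot.com/2011/05/hello.html',
    'https://example.com/hello/'
) . PHP_EOL;
```

Paths containing spaces or other special characters would need quoting or escaping before being pasted into .htaccess, which this sketch does not attempt.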

My posts and comments moved across but some things are stripped out

The importer uses the SimplePie classes to process the data. These in turn use a Simplepie_Sanitize class to remove potentially malicious code from the source data. If the php-xml module is not installed, your entire comment text may be stripped out and the error “PHP Warning: DOMDocument not found, unable to use sanitizer” may appear in your logs.

The comments don’t have avatars

This is a known limitation of the data that is provided by Blogger. WordPress uses Gravatar to provide the images for the comment avatars, and Gravatar relies on the email address of the person making the comment. Blogger does not provide the email address in the data feed, so WordPress cannot display the correct images. To work around this, you can update the comment email addresses manually or with a script.
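To see why the missing address matters: Gravatar looks an avatar up by a hash of the commenter’s normalized email. The sketch below shows the classic MD5 form of that lookup key; the function name is hypothetical, and newer Gravatar documentation also describes SHA-256 hashes.

```php
<?php
/**
 * Build the classic Gravatar lookup hash for an email address:
 * trim whitespace, lowercase, then MD5. Hypothetical helper.
 */
function my_gravatar_hash( $email ) {
    return md5( strtolower( trim( $email ) ) );
}

// The same person always maps to the same hash, regardless of case or padding.
var_dump( my_gravatar_hash( ' User@Example.com ' ) === my_gravatar_hash( 'user@example.com' ) );
```

Without an email address there is nothing to hash, so an imported comment falls back to the site’s default avatar.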

It does not seem to be processing the images

The most common reasons for this are lack of memory and timeouts; both should appear in your error log. Also check that you have not run out of disk space on your server. Because WordPress stores the files in multiple resolutions, one image might take up as much as 250 KB spread across 5 files of different sizes.
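A quick back-of-the-envelope estimate can tell you whether disk space is likely to be a problem before you start. This sketch uses the 250 KB per image figure above as an upper-end assumption; the function name is hypothetical.

```php
<?php
/**
 * Estimate disk usage in megabytes for a number of imported images.
 * The default of 250 KB per image is the upper-end figure quoted above.
 */
function my_blogger_image_disk_mb( $image_count, $kb_per_image = 250 ) {
    return $image_count * $kb_per_image / 1024.0;
}

// Example: a blog with 4096 images.
printf( "%.0f MB\n", my_blogger_image_disk_mb( 4096 ) );
```

Compare the result with the free space reported by your host, leaving headroom for the database and for images larger than the estimate.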

How do I make the images bigger or smaller? / My images are fuzzy

The importer will attempt to download a large version of images but it displays them on the blog at the medium size. If you go into your settings->media options then you can display a different size “medium” image by default. You can’t make this bigger than the file that has been downloaded which is where the next setting comes in.

The default size for the large images is 1024. You can change this to an even larger size by changing the following line in the blogger-importer.php file.

const LARGE_IMAGE_SIZE = '1024';

The file downloaded won’t be bigger than the original file, so if it was only 800×600 to start with then it won’t be any bigger than that.

If your original blog has hardcoded width and height values that are larger than the medium size settings, then that might result in your images becoming fuzzy.

I’ve run out of disk space processing the images

The importer is designed to download the high resolution images where they are available. You can either disable the downloading of images or you can change the constant LARGE_IMAGE_SIZE string in the blogger-importer.php file to swap the links with a smaller image.

User reviews

December 4, 2022
I am migrating from Blogger to Wordpress, and keeping the host name the same (I owned the domain blogger was hosting on, so I can just subtly switch to WP). The posts came over well, but the exported Blogger data included the permalink URL and this importer dropped that, so the URLs will change although I can keep the host name of the URL the same. I'm looking for any tool that can import with the permalinks included.
June 14, 2020
I managed to use this to migrate a tiny old blog from Blogger to WordPress in June 2020 successfully-ish. All of the published and draft posts from Blogger are now in WordPress. All images in the posts were uploaded to the WordPress Media Library; the img tags in the posts were not updated to point to the new Media Library uploads. All comments remained intact -- the author replacement only affected posts, not comment responses. All Blogger labels were turned into WordPress categories.
January 7, 2020 (1 reply)
Running the Blogger Importer in PHP 7 throws the following warning: Deprecated: Methods with the same name as their class will not be constructors in a future version of PHP; Blogger_Importer has a deprecated constructor in /xxxxxx/wp-content/plugins/blogger-importer/blogger-importer.php on line 44 Another user reported this months ago, but it's not fixed yet.

Contributors and developers

Blogger Importer is open source software, developed with the help of its contributors.

Blogger Importer has been translated into 28 locales. Thank you to all the translators for their contributions.

Changelog

0.9.2

  • Add support for WordPress 6.2

0.9.1

  • Add support for WordPress 6.1

0.9

  • Complete rewrite to use XML files instead.

0.8

  • Fixed issue with the authors form not showing the list of authors for a blog
  • Simplified check for duplicate comments
  • Code simplified for get_authors and get_author_form
  • Fixed issue with wpdb prepare and integer keys by switching to a sub select query
  • Make comment handling more robust
  • Simplified functions to reduce messages in the log

0.7

  • Fixed issue with drafts not being imported in the right state
  • Added extra error handling for get_oauth_link to stop blank tokens being sent to the form
  • Restructured code to keep similar steps in single function and to allow testing of components to be done
  • Re-incorporated the “congrats” function and provided a sensible list of what to do next
  • Add a geo_public flag to posts with geotags
  • Dropped _normalize_tag after confirming that it’s handled by SimplePie
  • Added image handling https://core.trac.wordpress.org/ticket/4010
  • Added setting author on images
  • Added error handling in get_oauth_link() as suggested by daniel_henrique ref https://core.trac.wordpress.org/ticket/21163
  • Added a check for OpenSSL as suggested by digitalsensus
  • Fixed issue with SimplePie sanitizer not getting set in WordPress 3.5
  • Added filter for the congrats function ‘blogger_importer_congrats’ so other plugins can add in new options
  • Converted manual HTML table to WP_LIST_TABLE
  • Moved inline Javascript to separate file to aid debugging and testing
  • Wrapped data sent to Javascript in I18n functions.
  • Fixed timeout error in the Javascript, timeouts were not being used.
  • Suppress post revisions when importing so that DB does not grow
  • Added processing of internal links
  • Added uninstall.php to remove options on uninstall
  • Added a timeout value to all of the wp_remote_get calls as people have reported timeout issues
  • Added a setting to control the large images downloaded from blogger.
  • Stopped logging all the post and comment IDs in arrays and storing them in an option; this improved the importing of very large blogs
  • Fixed issue with comment_author_IP notice
  • Code restructuring to use classes for blog objects
  • Changed AJAX calls to use technique described here https://codex.wordpress.org/AJAX_in_Plugins#Ajax_on_the_Administration_Side
  • Added AdminURL to the greet function rather than hardcoded path
  • Defaulted to turn off post pingbacks
  • Fix to stop it counting pingbacks, issue reported by realdoublebee
  • Retrofitted Security enhancement from 0.6, nonce added to form buttons on main screen
  • Security enhancement, nonce added to form button on authors screen
  • Updated POT file
  • Greek Translation from Stergatou Eleni https://buddypress.org/community/members/lenasterg/

0.6

  • Security enhancement, nonce added to form button on main screen

0.5

  • Merged in fix by SergeyBiryukov https://core.trac.wordpress.org/ticket/16012
  • Merged in rmccue change to get_total_results to also use SimplePie from https://core.trac.wordpress.org/attachment/ticket/7652/7652-blogger.diff
  • Reviewed in rmccue’s changes in https://core.trac.wordpress.org/attachment/ticket/7652/7652-separate.diff issues with date handling functions so skipped those
  • Moved SimplePie functions in new class WP_SimplePie_Blog_Item incorporating get_draft_status and get_updated and convert date
  • Tested comments from source blog GMT-8, destination London (currently GMT-1), comment dates transferred correctly.
  • Fixed typo in oauth_get
  • Added screen_icon() to all pages
  • Added GeoTags as per spec on https://codex.wordpress.org/Geodata
  • Change by Otto42, rmccue to use Simplepie XML processing rather than Atomparser, https://core.trac.wordpress.org/ticket/14525 ref: https://core.trac.wordpress.org/attachment/ticket/7652/7652-blogger.diff
    this also fixes https://core.trac.wordpress.org/ticket/15560
  • Change by Otto42 to use OAuth rather than AuthSub authentication, should make authentication more reliable
  • Fix by Andy from Workshopshed to load comments and nested comments correctly
  • Fix by Andy from Workshopshed to correctly pass the blogger start-index and max-results parameters to oAuth functions and to process more than one batch https://core.trac.wordpress.org/ticket/19096
  • Fix by Andy from Workshopshed error about incorrect enqueuing of scripts also changed styles to work the same
  • Change by Andy from Workshopshed testing in debug mode and wrapped ajax return into a function to suppress debug messages
  • Fix by Andy from Workshopshed notices for undefined variables.
  • Change by Andy from Workshopshed Added tooltip to results table to show numbers of posts and comments skipped (duplicates / missing key)
  • Fix by Andy from Workshopshed incorrectly checking for duplicates based on only the date and username, this gave false positives when large numbers of comments, particularly anonymous ones.

0.4

  • Fix for tracking images being added by Blogger to non-authenticated feeds https://core.trac.wordpress.org/ticket/17623

0.3

  • Bugfix for 403 Invalid AuthSub Token https://core.trac.wordpress.org/ticket/14629

0.1

  • Initial release