Introducing pdmreader

PowerDesigner is a powerful and popular data modeling tool, but it is too expensive for me to afford. As a developer, I have to consume the PDM artifacts created by my colleagues, so switching to another tool is not an option.

Read more >>

SubGit - Working with SVN repositories using Git

All the projects I currently work on at my company use SVN for version control, but I really dislike SVN:

Read more >>

Upgrading my blog to Spring Boot 2.0

Spring Boot 2.0 was finally released on March 1, 2018, and my blog has been upgraded to it from Spring Boot 1.5.9.

Spring Boot 2.0 is a major release, the result of 17 months of work. Its refactoring and the dependency updates that come with it break some existing configurations.

Package removals/renames:

  • The spring-boot-starter-mobile starter is removed.
  • spring-session should be replaced by spring-session-data-redis.

Gradle plugin updates:

  • The dependency management plugin is no longer applied automatically and has to be enabled explicitly.
  • The bootRepackage task is replaced by bootJar, and as a result the jar task is no longer invoked when building executable jars. This breaks existing configuration for the jar task.

Dependency updates:

  • Hibernate Validator from 5.3.6 to 6.0.7: org.hibernate.validator.constraints.NotEmpty is deprecated, and javax.validation.constraints.NotEmpty should be used instead.
  • Flyway: Spring Boot's default flyway.table has been changed from schema_version to flyway_schema_history.
  • spring-data-commons from 1.13 to 2.0: Configuration for 1-based pagination has to be changed.

Refactoring:

  • AbstractErrorController and ErrorViewResolver are moved from package org.springframework.boot.autoconfigure.web to org.springframework.boot.autoconfigure.web.servlet.error.
  • FreeMarkerAutoConfiguration.FreeMarkerWebConfiguration is replaced by a package-private class, breaking the configuration that extends the class.
Read more >>

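Soft hyphen (0xAD) - the invisible hyphen

While reading an English article, The Last JSON Spec, today, I found that popup-dict, the select-to-translate tool I wrote, did not work for many words, such as "ir­ra­tional".

On closer inspection, although "ir­ra­tional" looks like it has only 10 characters, "ir­ra­tional".length returns 12. The char code of the third character is 173 (0xAD in hexadecimal), which is not the letter "r". The whole string fails to match the regular expressions for English words, compound words, and sentences, so popup-dict ignores it.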

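A search showed that 0xAD is the soft hyphen (known in Chinese as 软连字符). This character explicitly suggests a line-break position for typesetting. It is normally invisible, but when a line break is needed, the line can break at this character, which is then rendered as a visible hyphen (how it is rendered depends on the language, i.e. the HTML lang attribute). All major browsers support it.

Testing shows that the soft hyphen has no effect on Chinese text. The related zero-width space and <wbr> have no effect on Chinese either, so they are only really useful for English.

Now that I understand the soft hyphen, the select-to-translate problem is easy to fix: just strip the character before matching.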

text = text.replace('\xad', '')
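The symptoms and the fix described above can be reproduced with a small Python sketch. The pattern WORD_RE is a hypothetical stand-in for illustration, not the actual expression used by popup-dict, and the soft hyphens are written as escapes so that they are visible in the source.

```python
import re

SOFT_HYPHEN = "\u00ad"  # U+00AD, char code 173

# Hypothetical word pattern for illustration; popup-dict's real regex may differ.
WORD_RE = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*")


def strip_soft_hyphens(text: str) -> str:
    """Remove every soft hyphen so the text can be matched as a plain word."""
    return text.replace(SOFT_HYPHEN, "")


# "irrational" with two soft hyphens inside it.
word = "ir\u00adra\u00adtional"
cleaned = strip_soft_hyphens(word)

print(len(word))                             # 12, although only 10 letters are visible
print(ord(word[2]))                          # 173
print(WORD_RE.fullmatch(word) is None)       # True: the raw selection is rejected
print(WORD_RE.fullmatch(cleaned) is not None)  # True: the cleaned text matches
```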
Read more >>

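popup-dict - a select-to-translate tool for Linux

When browsing English web pages or reading English e-books, select-to-translate is a very handy feature.

I had been using youdao-dict-for-ubuntu, a small tool built on Python 2 + Gtk+ 2 + webkit. It is simple and practical, but it has some shortcomings:

  • Python 2 and Gtk+ 2 are both outdated technologies
  • The bindings among Python 2, Gtk+ 2, and webkit pull in too many dependencies, and they are not in Arch's official repositories
  • Some minor functional problems
    • Selecting the same word again does not trigger a translation
    • The popup is always placed below the mouse pointer, so sometimes it is not fully visible
    • Clicking a link in the popup turns the popup blank, and only a restart fixes it

So I had long wanted to rewrite it with Python 3 + Gtk+ 3. I once tried to improve youdao-dict-for-ubuntu itself, but made little progress because I was unfamiliar with both Python and Gtk+. Recently its dependencies hit dynamic linking errors twice and it became unusable. I did not want to fix it again, and so popup-dict was born.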

popup-dict screenshot

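popup-dict has the following features:

  • Developed with Python 3 + Gtk+ 3, using the native Gtk+ UI
  • Few system dependencies: only Python 3 + Gtk+ 3 + PyGObject
  • Multi-threaded, so the main thread never blocks and the UI stays responsive
  • Lookups go through a wrapped query interface, making it easy to switch between translation services
  • The popup position adapts automatically so that the popup is always fully visible
  • The popup shows a link that opens the online dictionary when clicked
  • The dictionary is easy to enable or disable, so it does not get in the way when not needed (one-click on/off via the Gnome Shell Extension popup-dict-switcher)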
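The two design points about threading and the query interface can be sketched roughly as follows. This is not popup-dict's actual code: Translator, EchoTranslator, and query_in_background are hypothetical names, and the stand-in backend avoids any network access.

```python
import queue
import threading
from abc import ABC, abstractmethod


class Translator(ABC):
    """Hypothetical query interface; each translation service implements it."""

    @abstractmethod
    def query(self, text: str) -> str:
        ...


class EchoTranslator(Translator):
    """Stand-in backend that needs no network access."""

    def query(self, text: str) -> str:
        return f"[translation of {text!r}]"


def query_in_background(
    translator: Translator, text: str, results: queue.Queue
) -> threading.Thread:
    """Run the lookup in a worker thread so the caller is not blocked."""

    def worker() -> None:
        results.put((text, translator.query(text)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


results: queue.Queue = queue.Queue()
thread = query_in_background(EchoTranslator(), "irrational", results)
thread.join()  # only for this demo; a UI main loop would not wait here
print(results.get())
```

Swapping the translation service then only means passing a different Translator implementation. In a real Gtk+ application the result would be handed back to the main loop rather than joined on, for example with GLib.idle_add.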

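For now it only supports the Youdao Zhiyun (有道智云) translation service, and some features are still lacking. I will improve it gradually.

This tool is mainly meant to fit my own environment and needs (Arch Linux + Gnome 3, used mostly with browsers and PDF readers), and it also serves as a practice and learning project. I will polish it as far as time and energy allow; otherwise it will only cover my personal needs.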

Read more >>

JPA many-to-many update efficiency

When updating many-to-many relationships, the SQL executed by JPA may be quite inefficient.

Suppose we have a Post-Tag association: each post can have multiple tags, and each tag can also have multiple posts.

Tables:

  • post: id
  • tag: id
  • post_tag: id, post_id, tag_id

Java code:

// Post.java
@Entity
public class Post {
  @Id
  private Integer id;
  
  @ManyToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE})
  @JoinTable(
    name = "post_tag",
    joinColumns = @JoinColumn(name = "post_id"),
    inverseJoinColumns = @JoinColumn(name = "tag_id")
  )
  private List<Tag> tags = new ArrayList<>();

  // getters and setters

  public void addTag(Tag tag) {
    tags.add(tag);
    tag.getPosts().add(this);
  }

  public void removeTag(Tag tag) {
    tags.remove(tag);
    tag.getPosts().remove(this);
  }
}

// Tag.java
@Entity
public class Tag {
  @Id
  private Integer id;
  
  @ManyToMany(mappedBy = "tags")
  private List<Post> posts = new ArrayList<>();

  // getters and setters
}

Suppose a post (id = 1) has two tags (id = 1, 2). When adding a new tag (id = 3) to the post, JPA may issue the following SQL:

DELETE FROM post_tag WHERE post_id = 1;
INSERT INTO post_tag(post_id, tag_id) VALUES (1, 1);
INSERT INTO post_tag(post_id, tag_id) VALUES (1, 2);
INSERT INTO post_tag(post_id, tag_id) VALUES (1, 3);

Although inserting a single record would be enough, JPA first deletes all existing association records and then re-creates all of them as "new" records. The same problem applies when removing tags from the post.

It's quite inefficient. And if you have additional columns ("createdAt" for example) in the association table, or you have some triggers based on inserting/deleting association records, you may encounter some problems.
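To make the side effect on additional columns concrete, here is a minimal sketch that replays the same SQL pattern against an in-memory SQLite database. It uses Python's sqlite3 module only to run the statements, not JPA itself; the created_at column, the dates, and the helper add_tag_by_rewrite are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post_tag ("
    " post_id INTEGER NOT NULL,"
    " tag_id INTEGER NOT NULL,"
    " created_at TEXT NOT NULL)"  # hypothetical additional column
)
# Post 1 already has tags 1 and 2, both associated on 2018-01-01.
conn.executemany(
    "INSERT INTO post_tag VALUES (1, ?, '2018-01-01')", [(1,), (2,)]
)


def add_tag_by_rewrite(conn, post_id, tag_ids, now):
    """Mimic the delete-all-then-reinsert pattern shown above."""
    conn.execute("DELETE FROM post_tag WHERE post_id = ?", (post_id,))
    conn.executemany(
        "INSERT INTO post_tag VALUES (?, ?, ?)",
        [(post_id, tag_id, now) for tag_id in tag_ids],
    )


# Adding tag 3 rewrites all three rows.
add_tag_by_rewrite(conn, 1, [1, 2, 3], "2018-06-01")
rows = conn.execute(
    "SELECT tag_id, created_at FROM post_tag ORDER BY tag_id"
).fetchall()
print(rows)
```

All three rows now carry the new date, so the original created_at values of tags 1 and 2 are lost even though only tag 3 was added.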

How can we solve this problem? There are two possible solutions.

Read more >>

My own blog built with Spring Boot

I have been planning to build my own blog (previously it was powered by WordPress) for a long time, and now it is finally here.

The new blog is built with Spring Boot.

Why Spring Boot? Because I switched to Java several months ago. In addition to powering my blog site, the new blog also serves as a learning project.

In addition to Spring Boot, it also uses the following technologies or services:

Source code is available on GitHub.

The features are quite limited for now, but I will improve them gradually.

Read more >>

Deploy Rails application to sub-uri

Sometimes you may want to deploy a Rails application to a sub-uri. But this is not seamless unless the code was written carefully. The main problem is absolute urls.

Avoid absolute url

By default, a Rails application only works under the root uri unless properly configured. And (maybe) most developers assume that the application will be deployed under the root uri when writing code. So absolute urls are widely used, and they break when the application is deployed under a sub-uri. We should therefore avoid absolute urls and use relative urls or url helpers instead. There are two types of url in Rails: asset url and route url.

Asset url

Since most Rails applications use the Rails Asset Pipeline to manage assets, asset urls are usually generated by asset url helpers like asset_path, asset_url, image_path, etc. These helpers take care of the sub-uri for you if the application is configured properly. (See AssetUrlHelper for more helpers and their usage. These helpers are available in the view layer. To use them in other places, prefix them with ActionController::Base.helpers.) However, if you put assets directly under the public folder, then taking care of the sub-uri is your own responsibility. In that case, use a relative url or prefix the url with Rails.configuration.relative_url_root.

Asset Pipeline

You must set the RAILS_RELATIVE_URL_ROOT environment variable when precompiling assets, otherwise the urls generated by url helpers (such as image-url in scss) will not include the sub-uri.

RAILS_RELATIVE_URL_ROOT=/sub-uri bundle exec rake assets:precompile

If you use a deployer (such as mina or capistrano) to precompile assets automatically in production environments, refer to the deployer's documentation on how to pass environment variables to the precompile task. For mina 0.3.x, you can set env_vars in your deploy.rb:

set :env_vars, 'RAILS_RELATIVE_URL_ROOT=/sub-uri'
Read more >>

Let's Encrypt SSL certificates: applying and configuring

lets-encrypt

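Why HTTPS is necessary

  1. Protecting user privacy and account security. HTTPS transmits encrypted content between the client and the server, so even if the traffic is eavesdropped it is extremely hard to decrypt, whereas HTTP transmits plaintext. For login, HTTP can hardly keep user accounts safe: if the password is sent in plaintext, an attacker can easily eavesdrop on it; if a fixed encryption algorithm is applied, the attacker can hardly recover the plaintext password but can still impersonate the user through a replay attack.

  2. Preventing hijacking. In China's network environment, where ISPs rampantly hijack traffic and inject ads, adopting HTTPS widely is essential. Note that to prevent hijacking, the whole site must be served over HTTPS and must not load any non-HTTPS resources; otherwise it can still be hijacked.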

    hijack

HTTPS cannot guarantee absolute security, but it greatly raises the bar and the cost of attacks and hijacking, and that is enough.
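The replay attack mentioned in point 1 can be illustrated with a short sketch. It assumes an unsalted SHA-256 hash as a stand-in for the "fixed encryption algorithm"; the Server class and the password are hypothetical and exist only for this example.

```python
import hashlib


def fixed_transform(password: str) -> str:
    """A fixed transformation: the same password always yields the same value."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


class Server:
    """Hypothetical server that compares the submitted value with the stored one."""

    def __init__(self, password: str) -> None:
        self._expected = fixed_transform(password)

    def login(self, submitted: str) -> bool:
        return submitted == self._expected


server = Server("s3cret")

# The legitimate client sends the transformed password over plain HTTP;
# this is exactly what an eavesdropper sees on the wire.
captured = fixed_transform("s3cret")

# The attacker never learns "s3cret", but resending the captured value works.
print(server.login(captured))
```

Because the transmitted value never changes between logins, capturing it once is as good as knowing the password.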

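Certificates

Deploying HTTPS requires a certificate. A certificate mainly comes with a public/private key pair, which is used to encrypt the content transmitted between the client and the server. Anyone can generate a certificate, but only certificates issued by an authoritative Certificate Authority (CA) are trusted by mainstream browsers. The CA is responsible for vetting applicants and vouching for the security of the certificate. By validation level, certificates fall into three main types:

  1. Domain Validation (DV): only ownership of the domain is verified when the certificate is issued, so anyone (including bad actors) can apply. It is simple to apply for and inexpensive, which makes it a good fit for personal websites and small and medium-sized businesses;

  2. Organization Validation (OV): ownership of the domain is verified, and the applicant must be a legitimate organization;

  3. Extended Validation (EV): the CA performs a more thorough review and verification of the applicant. For EV certificates, browsers usually show the address bar in green and display the name of the certificate owner, so e-commerce and payment sites typically use this type.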

    EV Certificate

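Because of the costs of vetting, auditing, and guarantees, certificates usually cost money; some EV certificates run as high as several thousand US dollars per year.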

Let's Encrypt

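Let's Encrypt is a free, automated, and open certificate authority run by the Internet Security Research Group (ISRG). ISRG is a public-benefit organization focused on Internet security. Its major sponsors include the Mozilla Foundation, Akamai, Cisco, the Electronic Frontier Foundation (EFF), Facebook, IdenTrust, and the Internet Society; other participants include the University of Michigan, Stanford Law School, and the Linux Foundation. Let's Encrypt aims to remove barriers such as cost and server configuration so that encrypted connections become the default on the Internet. The key principles of Let's Encrypt are: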

Read more >>

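Deploying Rails applications with Capistrano

Capistrano is a remote server automation tool written in Ruby. It is a general-purpose tool, not designed specifically for Rails, but it supports Rails very well.

Advantages

  • No configuration is needed on the server
  • One-command deployment and rollback
  • Easy to extend; custom tasks can be added conveniently

How it works

A series of tasks (essentially shell scripts) is defined in advance and then executed on the remote server over ssh. Hooks define when and in what order the tasks run. Multiple deployment setups can be created, called stages in capistrano, each with its own configuration file.

Basic workflow

Assume the server runs the Rails application with Nginx + Passenger + Rvm; see each tool's documentation for installation and configuration. The basic workflow for deploying a Rails application is:

  1. Push the code to the remote repository
  2. Run the capistrano deploy command locally
  3. capistrano connects to the remote server over ssh, pulls the code from the remote repository, and then runs tasks such as installing dependencies, compiling assets, and migrating the database
  4. Restart passenger so that the new code takes effect

Basic requirements

Server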

Read more >>

Hello world!

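I have finally set up my own blog, using Wordpress. I had long wanted one but never found a blogging platform I liked. My ideal blogging platform looks like this:

  1. A simple feature set: posts, a tag cloud, comments, and search are enough
  2. Posts are written in Markdown
  3. Data is stored in a database rather than on the file system

I have recently started writing one myself, with the code hosted on GitHub and built on the Ruby On Rails framework. Besides using it myself, it also serves as a practice project for learning Rails. Since I needed a blog right away, I set one up with Wordpress first, and I will migrate once my own blog site is done.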

Read more >>