正则表达式高级替换技巧

 

本文着重于实际开发中如何使用正则表达式的替换技巧来提高工作效率,相关知识储备不做详细介绍。

演示环境:macOS Catalina 10.15.7; idea 2019+; wps表格;

常用正则表达式搜索关键字

关键字 作用
^、& 开头;结尾
. 任意字符
[xyz] 、[^xyz] 匹配[]里面其中任意一个; 取前面的反集,即不匹配[]里面任何一个
\d 数字字符 [0-9]
\w 任意一个字母或数字或下划线 [A~Za~z0~9]
\s 匹配任何空白字符 [\f\n\r\t\v]
\D、\W、\S 代表和对应小写相反的内容 [^0-9] [^A~Za~z0~9] [^ \f\n\r\t\v]
(xx) 圈定xx内容为一个分组
?、+、* 数量匹配,分别为 0或1[01];1以上 [1,]; 任意个[0,]
(?=xx) 、(?!xx) 后面为xx;后面不为xx
(?<=xx) 、(?<!xx) 前面为xx;前面不为xx
\b 匹配单词边界

常用正则表达式替换关键字

| 关键字 | 作用 | | — | — | | $x | 取第x个分组 $1代表第一个分组 | | \u 和 \U | 后续一个转为大写;后续所有转为大写 | | \l 和 \L | 后续一个转为小写;后续所有转为小写 | | \E | 结束上述 \U \L 的转换 |

基本流程

分析梳理,查找,分组圈定,替换

需求实操

一、model转换类 (sql、model、pb、vo 转换 以及他们之间的get&set )

从db里面查数据,用model接收,转成pb通过rpc传递给api,api再转成前端交互vo

1、下划线转驼峰

//源代码:
id
activity_id
schedule_id
author_id
status
settle_begin_time
settle_end_time
create_time
update_time
data
//目标代码:
id
activityId
scheduleId
authorId
status
settleBeginTime
settleEndTime
createTime
updateTime
data
  • 分析梳理: 很简单,就是把_去掉,然后把_后面的字母变成大写
  • 查找: _和它后面的一个字母 即 _\w
  • 分组圈定: 因为需要保留后面的字母,所以需要将其圈住 _(\w)
  • 替换: 取组1的数据,将其装换成大写 \u$1
  • 效果展示

2、驼峰转下划线

//源代码:
  同上目标代码
//目标代码:
  同上源代码
  • 分析梳理: 和前面相反,找到前面是小写的大写字母,插入_然后将大写转小写即可,记得需要开启IDEA的大小写敏感
  • 查找: 前面是小写的大写字母:(?<=[a-z])[A-Z]
  • 分组圈定: (?<=[a-z])([A-Z])
  • 替换: 取组1的数据,插入_,转成小写: _\l$1
  • 效果展示

3、驼峰转 get&set

//源代码:
id
activityId
scheduleId
authorId
status
settleBeginTime
settleEndTime  
createTime
updateTime
data
//目标代码:
target.setId(origin.getId());
target.setActivityId(origin.getActivityId());
target.setScheduleId(origin.getScheduleId());
target.setAuthorId(origin.getAuthorId());
target.setStatus(origin.getStatus());
target.setSettleBeginTime(origin.getSettleBeginTime());
target.setSettleEndTime(origin.getSettleEndTime());
target.setCreateTime(origin.getCreateTime());
target.setUpdateTime(origin.getUpdateTime());
target.setData(origin.getData());
  • 分析梳理: 找到每行的首字母,大写,然后填充get和set
  • 查找: 首字母和后面的字母: \w\w+
  • 分组圈定: 首字母和后面的需要分开成2组做不同的处理: (\w)(\w+)
  • 替换: get方面后面先大写首字母,然后取后面的字母: target.set\u$1$2(origin.get\u$1$2());
  • 效果展示

4、sql 转 model

//源代码:
    `id`                        BIGINT(20)   UNSIGNED NOT NULL COMMENT '主键',
    `activity_id`               BIGINT(20)   UNSIGNED NOT NULL COMMENT '活动id',
    `schedule_id`               bigint(20)   UNSIGNED NOT NULL COMMENT '周期id',
    `author_id`                 BIGINT(20)   UNSIGNED NOT NULL COMMENT '主播id',
    `status`                    TINYINT(3)   UNSIGNED NOT NULL COMMENT '当前状态 1:发出邀请2:接受邀请3:拒绝邀请4:任务已创建 5:已结算可提现',
    `settle_begin_time`         BIGINT(20)   UNSIGNED NOT NULL COMMENT '结算开始时间',
    `settle_end_time`           BIGINT(20)   UNSIGNED NOT NULL COMMENT '结算结束时间',
    `create_time`               BIGINT(20)   UNSIGNED NOT NULL COMMENT '创建时间',
    `update_time`               BIGINT(20)   UNSIGNED NOT NULL COMMENT '更新时间',
    `data`                      JSON                  NOT NULL COMMENT '其他数据',
//目标代码:
    private long id; //主键
    private long activityId; //活动id
    private long scheduleId; //周期id
    private long authorId; //主播id
    private int status; //当前状态 1:发出邀请,2:接受邀请,3:拒绝邀请,4:任务已创建 5:已结算可提现
    private long settleBeginTime; //结算开始时间
    private long settleEndTime; //结算结束时间
    private long createTime; //创建时间
    private long updateTime; //更新时间
    private String data; // 其他数据
  • 分析梳理: 我们可以分3步来完成,第一步提取必要字段,第二步变更类型,第三步下划线转驼峰。而需要保留的字段有:列名,类型,注释,很幸运必要的字段都有特征符:`和’;
  • 查找: ``包围了列名,`后面的空字符完了第一个单词是类型,而’‘包围了注释: `\w+`\s+\w+.*’(.*)’.*
  • 分组圈定: 圈住必要的3个字段就行: `(\w+)`\s+(\w+).*’(.*)’.*
  • 替换: 先替换成必要数据: private $2 $1; //$3,然后替换类型 BIGINT—>long、TINYINT->int即可,然后完成下划线转成驼峰
  • 效果展示

5、sql 转 pb (结合excel的骚操作)

以sql举例而不用model举例,因为符合下划线的格式,就不做驼峰转下划线的赘述了

//源代码:
    同上
//目标代码:
    int64 id = 1; //主键
    int64 activity_id = 2; //活动id
    int64 schedule_id = 3; //周期id
    int64 author_id = 4; //主播id
    int32 status = 5; //当前状态 1:发出邀请,2:接受邀请,3:拒绝邀请,4:任务已创建 5:已结算可提现
    int64 settle_begin_time = 6; //结算开始时间
    int64 settle_end_time = 7; //结算结束时间
    int64 create_time = 8; //创建时间
    int64 update_time = 9; //更新时间
    string data = 10; //其他数据
  • 分析梳理: pb 定义有个麻烦的要求就是添加编号,但是好在有excel,拉下去就能生成编号了。然后找到变量、类型、注释这三个元素,再进行拼装即可
  • 查找: 和上面一样 `\w+`\s+\w+.*’(.*)’.* 再加上数字编号
  • 分组圈定: 也同上,然后加上数字编号
  • 替换: $2 $1 = 数字编号; //$3
  • 效果展示
    • excel生成编号
    • 替换

二、运营配置类

运营给出配置文档,然后落地到项目的配置中,如 kconf 的 json

1、配置每小时红包预算

//源代码:

//复制出来
时段	5.16	5.17	5.18	5.19	5.2	5.21	5.22
0:00-1:00	98	40	67	86	38	20	72
1:00-2:00	67	6	141	110	142	108	69
2:00-3:00	8	78	129	128	6	128	29
3:00-4:00	52	72	72	72	10	57	70
4:00-5:00	25	54	146	91	138	66	29
5:00-6:00	114	94	126	139	110	77	118
6:00-7:00	44	81	57	128	78	120	117
7:00-8:00	135	110	53	23	149	138	7
8:00-9:00	81	54	34	37	118	128	54
9:00-10:00	34	141	72	38	112	98	37
10:00-11:00	114	102	142	128	114	120	124
11:00-12:00	70	67	78	78	86	99	25
12:00-13:00	86	89	78	54	29	14	99
13:00-14:00	44	78	4	70	52	126	113
14:00-15:00	64	37	138	57	100	78	81
15:00-16:00	99	110	64	117	87	120	128
16:00-17:00	146	146	112	116	37	77	66
17:00-18:00	110	23	52	149	100	146	67
18:00-19:00	23	112	11	37	54	110	10
19:00-20:00	145	107	113	7	89	78	94
20:00-21:00	52	78	78	25	114	114	80
21:00-22:00	69	116	4	114	10	2	53
22:00-23:00	14	78	117	94	54	78	37
23:00-24:00	141	38	67	15	34	100	45
//目标代码:
{
  "dateStr2hourBudgetConfigMap": {
    "20210610": {
      "hourBudgetMap": {
        "0":980000,
        "1":670000,
        "2":80000,
        "3":520000,
        "4":250000,
        "5":1140000,
        "6":440000,
        "7":1350000,
        "8":810000,
        "9":340000,
        "10":1140000,
        "11":700000,
        "12":860000,
        "13":440000,
        "14":640000,
        "15":990000,
        "16":1460000,
        "17":1100000,
        "18":230000,
        "19":1450000,
        "20":520000,
        "21":690000,
        "22":140000,
        "23":1410000
      }
    },
    "20210611": {
      "hourBudgetMap": {
        "0":400000,
        "1":60000,
        "2":780000,
        "3":720000,
        "4":540000,
        "5":940000,
        "6":810000,
        "7":1100000,
        "8":540000,
        "9":1410000,
        "10":1020000,
        "11":670000,
        "12":890000,
        "13":780000,
        "14":370000,
        "15":1100000,
        "16":1460000,
        "17":230000,
        "18":1120000,
        "19":1070000,
        "20":780000,
        "21":1160000,
        "22":780000,
        "23":380000
      }
    },
    //省略
    "20210612": {"hourBudgetMap": {}},
    "20210613": {"hourBudgetMap": {}},
    "20210614": {"hourBudgetMap": {}},
    "20210615": {"hourBudgetMap": {}},
    "20210616": {"hourBudgetMap": {}}
  }
}
  • 分析梳理: 第一步先精简小时格式,第二步替换每个节点的json字段格式,第三步按照日期进行复制
    1. 精简日期 F: ^(\d+):\S+ R: $1
    2. 组装json字段 F: ^(\d+)(.*?\s)(\b\d+\b)(?=\s) R: $1$2”$1”:$30000,       (注意最后的空格)

以上两步得到的结果: 然后把每列记录贴到空白模版即可:

优势总结

  • 操作花哨,身心愉悦
  • 提高效率,有时间撸更多代码
  • 不会错乱,只要正则写对,结果没有错误

简单小练习

kdb 一次性查询n张表

select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_0 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_2 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_4 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_6 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_8 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_10 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_12 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_14 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_16 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_18 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_20 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_22 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_24 where activity_id = 2021062703 and  schedule_id = 21 group by status union all 
select status,count(status) as count1 from live_magic_box_kickback_lifecycle_record_26 where activity_id = 2021062703 and  schedule_id = 21 group by status union all
# 省略30