SimpleDateFormat.parse()忽略模式中的字符数
我试图parsing一个datestring,可以有不同的格式树。 即使该string不应该匹配第二个模式它不知何故,因此返回一个错误的date。
这是我的代码:
import java.text.ParseException; import java.text.SimpleDateFormat; import java.util.Date; public class Start { public static void main(String[] args) { SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy"); try{ System.out.println(sdf.format(parseDate("2013-01-31"))); } catch(ParseException ex){ System.out.println("Unable to parse"); } } public static Date parseDate(String dateString) throws ParseException{ SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy"); SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MM-yyyy"); SimpleDateFormat sdf3 = new SimpleDateFormat("yyyy-MM-dd"); Date parsedDate; try { parsedDate = sdf.parse(dateString); } catch (ParseException ex) { try{ parsedDate = sdf2.parse(dateString); } catch (ParseException ex2){ parsedDate = sdf3.parse(dateString); } } return parsedDate; } }
随着input2013-01-31
我得到输出05.07.0036
。
如果我试图parsing31-01-2013
或31.01.2013
我得到31.01.2013
预期。
我认识到,如果我设置这样的模式,程序会给我完全相同的输出:
SimpleDateFormat sdf = new SimpleDateFormat("dMy"); SimpleDateFormat sdf2 = new SimpleDateFormat("dMy"); SimpleDateFormat sdf3 = new SimpleDateFormat("yMd");
为什么它会忽略我的模式中的字符数?
SimpleDateFormat有几个严重的问题。 默认的宽松设置可以产生垃圾回答,我不能想到一个宽大的任何好处的情况下。 这不应该是默认设置。 但是,放宽宽松只是解决scheme的一部分。 您仍然可以结束在testing中难以捕捉的垃圾结果。 有关示例,请参阅下面的代码中的注释。
这是SimpleDateFormat的一个扩展,强制严格模式匹配。 这应该是该类的默认行为。
import java.text.DateFormatSymbols; import java.text.ParseException; import java.text.ParsePosition; import java.text.SimpleDateFormat; import java.util.Date; import java.util.Locale; /** * Extension of SimpleDateFormat that implements strict matching. * parse(text) will only return a Date if text exactly matches the * pattern. * * This is needed because SimpleDateFormat does not enforce strict * matching. First there is the lenient setting, which is true * by default. This allows text that does not match the pattern and * garbage to be interpreted as valid date/time information. For example, * parsing "2010-09-01" using the format "yyyyMMdd" yields the date * 2009/12/09! Is this bizarre interpretation the ninth day of the * zeroth month of 2010? If you are dealing with inputs that are not * strictly formatted, you WILL get bad results. You can override lenient * with setLenient(false), but this strangeness should not be the default. * * Second, setLenient(false) still does not strictly interpret the pattern. * For example "2010/01/5" will match "yyyy/MM/dd". And data disagreement like * "1999/2011" for the pattern "yyyy/yyyy" is tolerated (yielding 2011). * * Third, setLenient(false) still allows garbage after the pattern match. * For example: "20100901" and "20100901andGarbage" will both match "yyyyMMdd". * * This class restricts this undesirable behavior, and makes parse() and * format() functional inverses, which is what you would expect. Thus * text.equals(format(parse(text))) when parse returns a non-null result. * * @author zobell * */ public class StrictSimpleDateFormat extends SimpleDateFormat { protected boolean strict = true; public StrictSimpleDateFormat() { super(); setStrict(true); } public StrictSimpleDateFormat(String pattern) { super(pattern); setStrict(true); } public StrictSimpleDateFormat(String pattern, DateFormatSymbols formatSymbols) { super(pattern, formatSymbols); setStrict(true); } public StrictSimpleDateFormat(String pattern, Locale locale) { super(pattern, locale); setStrict(true); } /** * Set the strict setting. If strict == true (the default) * then parsing requires an exact match to the pattern. Setting * strict = false will tolerate text after the pattern match. * @param strict */ public void setStrict(boolean strict) { this.strict = strict; // strict with lenient does not make sense. Really lenient does // not make sense in any case. if (strict) setLenient(false); } public boolean getStrict() { return strict; } /** * Parse text to a Date. Exact match of the pattern is required. * Parse and format are now inverse functions, so this is * required to be true for valid text date information: * text.equals(format(parse(text)) * @param text * @param pos * @return */ @Override public Date parse(String text, ParsePosition pos) { int posIndex = pos.getIndex(); Date d = super.parse(text, pos); if (strict && d != null) { String format = this.format(d); if (posIndex + format.length() != text.length() || !text.endsWith(format)) { d = null; // Not exact match } } return d; } }
它被logging在SimpleDateFormat
javadoc中:
对于格式化,模式字母的数量是最小位数,较短的数字是零填充到这个数量。 对于parsing,模式字母的数量将被忽略,除非需要分隔两个相邻的字段。
解决方法是用正则expression式testingyyyy-MM-dd格式:
public static Date parseDate(String dateString) throws ParseException { SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy"); SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MM-yyyy"); SimpleDateFormat sdf3 = new SimpleDateFormat("yyyy-MM-dd"); Date parsedDate; try { if (dateString.matches("\\d{4}-\\d{2}-\\d{2}")) { parsedDate = sdf3.parse(dateString); } else { throw new ParseException("", 0); } } catch (ParseException ex) { try { parsedDate = sdf2.parse(dateString); } catch (ParseException ex2) { parsedDate = sdf.parse(dateString); } } return parsedDate; }
谢谢@Teetoo。 这帮助我find解决我的问题的方法:
如果我想要parsing函数完全匹配模式,我必须将SimpleDateFormat.setLenient
“lenient”( SimpleDateFormat.setLenient
)设置为false
:
SimpleDateFormat sdf = new SimpleDateFormat("dMy"); sdf.setLenient(false); SimpleDateFormat sdf2 = new SimpleDateFormat("dMy"); sdf2.setLenient(false); SimpleDateFormat sdf3 = new SimpleDateFormat("yMd"); sdf3.setLenient(false);
这仍然会parsingdate,如果我只为每个细分市场使用一个模式字母,但它会认识到,2013年不能是一天,因此它不符合第二种模式。 与长度检查相结合,我准确地收回我想要的。